Efl.Model.Container and Efl.Model.Item to efl/interfaces are used
to create Efl.Model objects with predefined property values.
This is useful to any situation where we want an Efl.Model with
explicit defined property values.
Efl.Ui.View and Efl.Ui.Factory are used to connect Efl.Models with
Widgets, Elm.Layout and Efl.Ui.Image has changed to use news interfaces
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
This condition can never be true. It can't be NULL here. A NULL here
would have caused a crash earlier, though it can only be NULL if an
allocation fails, which is something that we don't really handle
for smallish allocations.
CID1366823
Since we keep a log of created and deleted objects, we can walk the
log and see which were leaked. As this is expensive, do only if log
level is greater than 3 (INFO, DEBUG...), with backtrace of object
creation being displayed as backtrace if running as level 4 (DEBUG).
This fixes an issue where efl_isa() wouldn't work for extensions or
ancestors of extensions of a class.
Example:
Class A implements interface F2
F2 inherits from interface F1
obj is of class A
Before this patch efl_isa(obj, F1) would return false, now it returns
true as expected.
This is just one example, there is a whole array of variations to this
issue that are now fixed.
Thanks to Gustavo for reminding me of this.
@fix
Tom is worried about performance hit (god, checking a bit in a pointer
we'll fetch to memory anyway, since we return it masked), so guard
under EO_DEBUG.
In C we need this to make clear that we really do not accept parameters.
Found by the smatch source code matcher. I had run and fixed this before
but it seems to creep in again over time.
When writing this ERR log I thought about "thread" (it's really
the keyword here) but eventually reworded to "context". Let's be
clearer about the possible issue here.
_efl_object_api_op_id_get() will query a hash for the given pointer,
however if it wasn't populated, it will return "NOOP" and we're
hopeless while debugging on what happened.
Common case is to use the incorrect method, like:
obj = efl_add(CLS1, ...);
cls2_method(obj);
Since we did not create CLS2, it won't populate its methods on the
hash, thus the lookup will return NOOP.
With this change the function now gets the target object and function
name so reports an insightful message such as:
ERR:eo file.c:123 cls2_method() Unable to resolve op for api func 0x7ff492ddea00 for obj=0x400000007e8ee1df (CLS1)
Eo pointer indirection is super nice as it avoids you to access
invalid memory, but this extra checks inhibits valgrind's own tracking
of memory lifecycle, usually it would report when the object was
created and when the object is deleted, both as stack traces.
This commits introduces logging of object creation and destruction
under its own eina_log_domain and controlled by EO_LIFECYCLE_DEBUG and
EO_LIFECYCLE_NO_DEBUG envvars. These will only be available if
compiled with EO_DEBUG, thus shouldn't cause any performance hits on
production code.
Running a bogus app with invalid efl_class_name_get() and double
efl_del() will report as below:
```sh
$ export EO_LIFECYCLE_NO_DEBUG=Efl_Loop_Timer,Efl_Promise,Efl_Future
$ export EO_LIFECYCLE_DEBUG=1
$ export EINA_LOG_LEVELS=eo_lifecycle:4
$ /tmp/bogus_app
DBG:eo_lifecycle lib/eo/eo.c:2712 _eo_log_obj_init() will log all object allocation and free
DBG:eo_lifecycle lib/eo/eo.c:2788 _eo_log_obj_init() will NOT log class 'Efl_Future'
DBG:eo_lifecycle lib/eo/eo.c:2788 _eo_log_obj_init() will NOT log class 'Efl_Promise'
DBG:eo_lifecycle lib/eo/eo.c:2788 _eo_log_obj_init() will NOT log class 'Efl_Loop_Timer'
DBG:eo_lifecycle lib/eo/eo.c:2665 _eo_log_obj_new() new obj=0x563fa35a1aa0 obj_id=0x4000000002cf38ef class=0x563fa35a1450 (Efl_Vpath_Core) [0.0004]
DBG:eo_lifecycle lib/eo/eo.c:2665 _eo_log_obj_new() new obj=0x563fa35af8d0 obj_id=0x4000000006cf38f0 class=0x563fa35aecf0 (Efl_Loop) [0.0005]
DBG:eo_lifecycle lib/eo/eo.c:2665 _eo_log_obj_new() new obj=0x563fa35d61a0 obj_id=0x400000007ecf390e class=0x563fa35d48f0 (Efl_Net_Dialer_Simple) [0.0054]
DBG:eo_lifecycle lib/eo/eo.c:2665 _eo_log_obj_new() new obj=0x563fa35d6470 obj_id=0x4000000082cf390f class=0x563fa35d0d60 (Efl_Net_Dialer_Tcp) [0.0055]
DBG:eo_lifecycle lib/eo/eo.c:2665 _eo_log_obj_new() new obj=0x563fa35d75b0 obj_id=0x4000000086cf3910 class=0x563fa35d66b0 (Efl_Io_Queue) [0.0056]
DBG:eo_lifecycle lib/eo/eo.c:2665 _eo_log_obj_new() new obj=0x563fa35d8f70 obj_id=0x400000008acf3911 class=0x563fa35d7860 (Efl_Io_Copier) [0.0057]
DBG:eo_lifecycle lib/eo/eo.c:2665 _eo_log_obj_new() new obj=0x563fa35df980 obj_id=0x40000000a6cf3918 class=0x563fa35d66b0 (Efl_Io_Queue) [0.0058]
DBG:eo_lifecycle lib/eo/eo.c:2665 _eo_log_obj_new() new obj=0x563fa35dfc30 obj_id=0x40000000aacf3919 class=0x563fa35d7860 (Efl_Io_Copier) [0.0058]
will efl_class_name_get() with invalid handle:
ERR:eo lib/eo/eo.c:1013 efl_class_name_get() Class (0x2000000000000029) is an invalid ref.
ERR:eo_lifecycle lib/eo/eo.c:1013 efl_class_name_get() obj_id=0x2000000000000029 was neither created or deleted (EO_LIFECYCLE_NO_DEBUG='Efl_Loop_Timer,Efl_Promise,Efl_Future').
DBG:eo_lifecycle lib/eo/eo.c:2688 _eo_log_obj_free() free obj=0x563fa35df980 obj_id=0x40000000a6cf3918 class=0x563fa35d66b0 (Efl_Io_Queue) [0.0061]
DBG:eo_lifecycle lib/eo/eo.c:2688 _eo_log_obj_free() free obj=0x563fa35dfc30 obj_id=0x40000000aacf3919 class=0x563fa35d7860 (Efl_Io_Copier) [0.0061]
DBG:eo_lifecycle lib/eo/eo.c:2688 _eo_log_obj_free() free obj=0x563fa35d75b0 obj_id=0x4000000086cf3910 class=0x563fa35d66b0 (Efl_Io_Queue) [0.0061]
DBG:eo_lifecycle lib/eo/eo.c:2688 _eo_log_obj_free() free obj=0x563fa35d8f70 obj_id=0x400000008acf3911 class=0x563fa35d7860 (Efl_Io_Copier) [0.0061]
DBG:eo_lifecycle lib/eo/eo.c:2688 _eo_log_obj_free() free obj=0x563fa35d6470 obj_id=0x4000000082cf390f class=0x563fa35d0d60 (Efl_Net_Dialer_Tcp) [0.0063]
DBG:eo_lifecycle lib/eo/eo.c:2688 _eo_log_obj_free() free obj=0x563fa35d61a0 obj_id=0x400000007ecf390e class=0x563fa35d48f0 (Efl_Net_Dialer_Simple) [0.0063]
will double free:
ERR:eo ../src/lib/eo/efl_object.eo.c:78 efl_del() EOID 0x400000007ecf390e is not a valid object. EOID domain=0, current_domain=0, local_domain=0. EOID generation=2cf390e, id=1f, ref=1, super=0. Thread self=main. Available domains [0 1 ]. Maybe it has been deleted or does not belong to your thread?
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() obj_id=0x400000007ecf390e created obj=0x563fa35d61a0, class=0x563fa35d48f0 (Efl_Net_Dialer_Simple) [0.0054s, 0.0009 ago]:
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0bc6d0ea: libeo_dbg.so+0x90ea (in src/lib/eo/.libs/libeo_dbg.so 0x7f2c0bc64000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0bc6ca62: _efl_add_internal_start+0x1c2 (in src/lib/eo/.libs/libeo_dbg.so 0x7f2c0bc64000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x00563fa15dc95f: bogus_app+0x295f (in /tmp/bogus_app 0x563fa15da000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0ace7291: __libc_start_main+0xf1 (in /usr/lib/libc.so.6 0x7f2c0acc7000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x00563fa15dc48a: _start+0x2a (in /tmp/bogus_app 0x563fa15da000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() obj_id=0x400000007ecf390e deleted obj=0x563fa35d61a0, class=0x563fa35d48f0 (Efl_Net_Dialer_Simple) [0.0063s, 0.0000 ago]:
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0bc6d8ba: libeo_dbg.so+0x98ba (in src/lib/eo/.libs/libeo_dbg.so 0x7f2c0bc64000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0bc6d711: libeo_dbg.so+0x9711 (in src/lib/eo/.libs/libeo_dbg.so 0x7f2c0bc64000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0bc6beb8: libeo_dbg.so+0x7eb8 (in src/lib/eo/.libs/libeo_dbg.so 0x7f2c0bc64000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0bc6c06e: _efl_object_call_end+0x4e (in src/lib/eo/.libs/libeo_dbg.so 0x7f2c0bc64000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0bc75725: efl_del+0x105 (in src/lib/eo/.libs/libeo_dbg.so 0x7f2c0bc64000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x00563fa15dcd54: lt-efl_net_dialer_simple_example+0x2d54 (in /tmp/bogus_app 0x563fa15da000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x007f2c0ace7291: __libc_start_main+0xf1 (in /usr/lib/libc.so.6 0x7f2c0acc7000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() 0x00563fa15dc48a: _start+0x2a (in /tmp/bogus_app 0x563fa15da000)
ERR:eo_lifecycle ../src/lib/eo/efl_object.eo.c:78 efl_del() obj_id=0x400000007ecf390e was already deleted 0.0000 seconds ago!
```
if for some reason we fail to validate a class, then we should skip
that extension. This may result in an empty vtable, then check for
that and avoid a crash.
This is very unlike to happen in practice, but I've forced some
validation errors and could get to that.
Instead of 2 sets of macro, one for HAVE_EO_ID and another without,
use a single set of macros and have the implementation of
_eo_class_pointer_get() and _eo_obj_pointer_get() to do the actual
These functions now take the source information so the logs reflect
that and not always the same function.
_eo_pointer_error() was kinda a bitch to debug as it provided a nice
breakpoint location, but did not provide a good output since the file,
line and function were always the same.
Change that to be a thin wrapper on top of eina_log_vprint(), then we
keep the breakpoint location yet provide useful information.
In that sense, change other error messages so they carry as much
information as possible.
This fixes T4907
The problem was that in efl_event_callback_add the internal array was
changed. If this was happening while a efl_event_callback_call was
happening the for loop got confused and skipped one event subscription.
Which led to a bug in e where the idler ufnction was not executed
probebly and so the canvas stayed frozen.
if there is a event the callback counter should be incremented not
decremented. This should fix a few crashes i found in edje, since edje
did not knew that a element was deletion.
Normally when debugging Eo with gdb you can just use any of the internal
eo functions to resolve the id to its internal pointer. However, when
loading a coredump you can't execute any code, not even the id resolve
code.
This change adds a gdb function that resolves the id to its pointer form
without executing any code in the process space. This plugin is
essentially the id resolve code written in python as a gdb function.
Usage:
Print the pointer:
(gdb) print $eo_resolve(obj)
$1 = (_Eo_Object *) 0x5555559bbe70
Use it directly (e.g. to print the class name):
(gdb) $eo_resolve(obj)->klass->desc.name
This plugin requires that the coredump would be loaded with the exact
same libeo.so binary (or at least one that hasn't changed eo internals),
and that the debug symbols for libeo.so would be available for gdb to
use.
Note:
This feature is incomplete and only resolves IDs that are owned by the
main thread and in the main domain. This is not a big issue at the
moment, because almost all of our IDs are like that.
@feature
so hunting another callback issue i noticed some of THE most popular
callbacks are:
1411 tick
1961 move
4157 pointer,move
7524 dirty
8090 damage
13052 render,flush,post
13052 render,flush,pre
13205 render,post
13205 render,pre
21706 recalc
21875 idle
27224 resize
27779 del
31011 idle,enter
31011 idle,exit
60461 callback,del
104546 callback,add
126400 animator,tick
as you can see callback del, add and the general obj del cb's are
right up there... so it is very likely a good idea to CHECK to see if
anyone is listening before calling the callback for these very very
very common calls.
this is ifdef'd and turned on for now. it can be turned off. it
shouldnt use more memory due to the way memory alignment works (likely
all allocations will be multiples of 8 bytes anyway) so we're using
spare unused space. the only q is - is management of the counts AND
checking them worth it in general? it's really hard to tell given we
dont have a lot of benchmarks that cover a lot of use cases... it
doesnt seem to slow or speed anything up much in the genlist bounce
test... so i think i need something more specific.
@optimize
i found a massive slowdown that over time ended up with 10000's of
cb's in objects like the ecore loop object. this fixes that by
ACTUALLY flagging event deletions waiting to be true rather than false.
When resolving a function for an object the object would get reference,
and then in some failure cases won't be freed.
I suspect this is a regression following the reshuffling that was done
in that function recently.
Thanks to zmike for investigating and reporting this.
Fixes T4740
@fix
This happens with shared objects.
The situation seems to be:
1. object has composited object a of class A in thread 1
2. call something on object a from thread 2, deadlock
In fact, do anything from thread 2 on a shared object and you deadlock.
this moves a lot of error case handling into goto's so the code gets
out of the hot path and this should help expecially since variou
smacros do things like:
do { char buf[256]; sprintf(buf, fmt, ptr); _eo_pointer_error(buf); } while (0)
_Efl_Class *klass; \
do { \
klass = _eo_class_pointer_get(klass_id); \
if (!klass) { \
_EO_POINTER_ERR("Class (%p) is an invalid ref.", klass_id); \
return ret; \
} \
} while (0)
so putting quite a chunk of code inside a rare "if this errors"
handler that will cause l1 cache misses and this we don't want, thus
moving stuff in eo core out of hot paths to cut down on overhead. yes
it might not be pretty but it's kind of the right thing at such a core
level of efl. this also does the same to the eo base class as this is
also going to be relatively hot given it's the core of every other
object.
so there were a few issues. one we had a spinlokc on the eoid table
for shared objects AND then had a mutex for accessing those objects
(released on return from any eo function). BUT this missed some funcs
like eo_ref, eo_unref and so on in eo.c ... oops. so fixed. but then i
realized there was a race condition. we locked the eoid table then
unlocked with our pointer THEN locked the sharted object mutex ...
then unlocked it. that was a race condtion gap. so we should share the
same lock anyway - if it's a shared object, grab the shared object
mutex then do a lookup and if the lookup does not fail, KEEP the lock
until it is released by the return from eo function or by some special
macro/funcs that released a matching lock. since its a recursive lock
this is all fine. as its also a universal single lock for all objects
we just need the eoid to know if it's shared and needs locking based
on the domain bits. so now do this locking properly with just a single
mutex, not both a spinlock and mutex and keep the lock around until
totally done with the object. this plugs the race condition holes and
goes from 1 spinlock lock and unlock then a mutex lock and unlokc to
just a single mutex lock and unlock. this means shared objects are
actually truly safe across threads and only have the overhead of a
single recursive mutex to lock and unlock in every api call.
as per other recent benchmarking, moving rearely run code (in this
case code to init the op etc.) out of the l1 cacheline prefetch inot a
blob of code at the end of the function where we goto and goto back
again should provide decent-ish speedups for the resolv cache in
avoding this code. yes it makes the code less pretty to read but at
this really low level hot path ... evil things must happen to get the
speed we want/need.
Summary:
as told in _eina_stringshared_key_cmp in eina_hash.c:
originally we want to do this:
return key1 - key2;
but since they are ptrs and an int can't store the different of 2 ptrs in
either 32 or 64bit (signed hasn't got enough range for the diff of 2
32bit values regardless of their type... we'd need 33bits or 65bits)
So changing this to the same logic.
Reviewers: tasn, raster
Subscribers: cedric, jpeg
Differential Revision: https://phab.enlightenment.org/D4298
whenan eoid lookup fails, now print a lot of information on the issue
like the actual id, generation of the id, if its a class or object
(the class bit), if its ref or super bit is set, the actual id (which
includes the table heirachy), which thread id it is, what domain the
object id is and the current and local domains as well as what domains
are mapped in.
This change lets us remove a field from the structure that leads to
around 20KiB more of saving in private dirty pages in elementary.
This also looks a bit better and feels a bit cleaner.
Breaks API and ABI.
we now just lost another bit from generation count. down to 6 in 32bit
and 26 in 64 bit. this sucks but is necessary. now we are using the
bits just below ref and super bits the code was just maskign off the
next bit as a class marker. this was so so so so wrong. it was the ide
table space. we just never used numbers high enough to start using it.
since i added domain there now those bits can be used easily with
thread domain or other domain. argh! existing eo bug found and fixed.
annoying! :) i added another #define there just to be clear we use
that bit for classes.
Before this commit, function overrides were explicit. That is, you'd
have to explicitly state you were overriding a function instead of
creating a new one. This made the code a tad more complex, and was also
a bit more annoying to use. This commit removes this extra piece of
information.
This means we now store much less information per function, that will
let us further optimise out structures in the future.
Now that we have recursive locks, the class creation code can be much simpler.
All the code there was essentially our own implementation of recursive locks,
or rather a special case of those.
This is no longer needed.
this adds a signle mutex (recursive) mutex for all eo objects that is
auto-called by _efl_object_call_resolve() and _efl_object_call_end()
that wrap all eo method calls and since its recursive it can be
blindly called for sub-calls. this will lock all shared objects during
any call to any shared object so only the thread calling now has
access until it releases. not fine-grained but good enough and the
best we can do "simplistically".
This moved all the eoid tables, eoid lookup caches, generation count
information ad eo_isa cache into a TLS segment of memory that is
thread private. There is also a shared domain for EO objects that all
threads can access, but it has an added cost of a lock. This means
objects accessed outside the thread they were created in cannot be
accessed by another thread unless they are adopted in temporarily, or
create4d with the shared domain active at the time of creation. child
objects will use their parent object domain if created with a parent
object passed in. If you were accessing EO (EFL) objects across threads
before then this will actually now cause your code to fail as it was
invalid before to do this as no actual objects were threadsafe in EFL,
so this will force things to "fail early".
ecore_thread_main_loop_begin() and end() still work as this uses the
eo domain adoption features to temporarily adopt a domain during this
section and then return it when done.
This returns speed back to eo brining the overhead in my tests of
lookup for the elm genlist autobounce test in elementary from about
5-7% down to 2.5-2.6%. A steep drop.
This does not mean everything is perfect. Still to do are:
1. Tests in the test suite
2. Some API's to help for sending objects from thread to thread
3. Make the eo call cache TLS data to make it also safe
4. Look at other locks in eo and probably move them to TLS data
5. Make eo resolve and call wrappers that call the real method func do
recursive mutex wrapping of the given object IF it is a shared object
to provide threadsafety transparently for shared objects (but adding
some overhead as a result)
6. Test test est, and that is why this commit is going in now for wider
testing
7. Decide how to make this work with sending IPC (between threads)
8. Deciding what makes an object sendable (a sendable property in base?)
9. Deciding what makes an object shareable (a sharable property in base?)
I knew Windows doesn't allow statically initialising pointers in the
global namespace, I had no idea it also applies to functions. That's
quite annoying.
Thanks to Cedric for reporting.