This reverts commit 4b116627c2.
This can't be done, because the freeze state can change from within the
callbacks so you need to check if events are frozen every time.
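A minimal sketch of why the check has to sit inside the loop. The types and names here (Demo_Obj, demo_event_call, freeze_count) are hypothetical stand-ins, not the real Eo structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Eo types. */
typedef struct Demo_Obj Demo_Obj;
typedef void (*Demo_Cb)(Demo_Obj *obj);

struct Demo_Obj {
    int freeze_count;   /* > 0 means events are frozen */
    Demo_Cb cbs[4];
    size_t cb_count;
    int delivered;      /* number of callbacks actually invoked */
};

/* Dispatch to every callback, re-reading freeze_count on each
 * iteration because a callback may freeze or thaw the object. */
static void demo_event_call(Demo_Obj *obj)
{
    assert(obj != NULL);
    for (size_t i = 0; i < obj->cb_count; i++) {
        if (obj->freeze_count > 0)
            continue;   /* frozen by an earlier callback: skip */
        obj->delivered++;
        obj->cbs[i](obj);
    }
}

/* Example callbacks: one freezes the object, one does nothing. */
static void demo_cb_freeze(Demo_Obj *obj) { obj->freeze_count++; }
static void demo_cb_noop(Demo_Obj *obj) { (void)obj; }
```

Checking the freeze state once before the loop would still deliver the callbacks that come after the one that froze the object.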
This is faster in most cases, and to be honest, should be much faster
than it is. I don't understand why there's no better directive to mark a
variable as *really* important thread storage that is used all the time.
We don't really need the eo_id most of the time, and when we do, it's
very easy to get it. It's better if we just don't save the eo_id on the
stack, and just save if it's an object or a class instead.
It seems that the idea behind that optimisation is to save object data
fetching when calling functions implemented by the object's class inside
functions implemented by the object's class. This should be rare enough
not to be worth the upkeep, memory reads and memory writes, especially
since for all cases apart from mixins (for which this optimisation won't
work anyway), the upkeep is more costly than fetching the data
again.
with eo id indirection on, a null object is silently ignored as an ok
error case (like free(NULL)). but if you turn eo id off in the build it's
all complaints from here to the black stump. fix this by making the eo
id "off" path match eo id on by making null objects silent.
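a rough sketch of the intended behaviour, using made-up names (Demo_Obj, DEMO_MAGIC, demo_value_get) rather than the real eo code: NULL returns quietly, only a non-NULL but invalid object complains.

```c
#include <stdio.h>

/* Hypothetical object layout for illustration; not the real Eo one. */
#define DEMO_MAGIC 0x1234

typedef struct {
    int magic;
    int value;
} Demo_Obj;

static int demo_errors; /* counts complaints, for demonstration only */

static int demo_value_get(const Demo_Obj *obj)
{
    if (!obj)
        return 0;       /* NULL is an ok error case: stay silent */
    if (obj->magic != DEMO_MAGIC) {
        /* non-NULL but invalid: this one is worth a complaint */
        demo_errors++;
        fprintf(stderr, "demo: %p is not a valid object\n", (const void *)obj);
        return 0;
    }
    return obj->value;
}
```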
@fix
removing the klass member meant removing hooks and keeping cache small
but that meant not using it. this meant if the object is not an obj...
i removed the:
call->obj = _eo_class_id_get(call->klass);
line - seemed harmless/pointless. apparently not. so put it back but
use the klass there in local vars and not in call as it's not there
(and not needed).
fix.
we pass both the callcache and the op id - both are static and filled
in at runtime, so merge them into the same struct. this should lead to
better alignment/padding with the offset array and the next slot and
op fields, probably saving about 4-8 bytes of ram per method with no
downsides. also pass in only cache ptr, not both cache ptr and opid -
less passing of stuff around and should be better.
so. clang is wrong. end of story. it complains that i should add
braces to:
static Eo_Call_Cache ___callcache = { 0 };
WRONG. that is correct c99. 100%. you can add more {}'s and init every
field separately like {{0},{0},{0}} etc. or make it 1 or any value -
it doesn't matter... clang complains. clang is wrong. plain and
simple. this warning should just never exist. it is pointless.
but... people won't shut up about clang warnings until i "fool" clang
into being silent by assuming the default 0 value of static storage.
this silences clang
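a small sketch of what relying on the default looks like. the struct fields here are invented for the example (the real Eo_Call_Cache layout is not shown above); the only thing it relies on is that c zero-initializes objects with static storage duration when no initializer is given.

```c
#include <stddef.h>

/* Hypothetical cache layout; the real Eo_Call_Cache fields may differ. */
typedef struct {
    unsigned int op;
    const void *klass;
    const void *func;
} Demo_Call_Cache;

static int demo_cache_is_zeroed(void)
{
    /* No "= { 0 }" here: static storage is zero-initialized by the
     * language, so integers start at 0 and pointers start as NULL. */
    static Demo_Call_Cache cache;

    return cache.op == 0 && cache.klass == NULL && cache.func == NULL;
}
```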
BEWARE! this breaks eo ABI. _eo_call_resolve and _eo_data_scope_get
are 2 of the biggest cpu users in eo. they easily consume like 10-15%
cpu between them on tests that drive a lot of api - like simply
scrolling a genlist around. this is a lot of overhead for efl. this
fixes that to make them far leaner. In fact this got an overall 10%
cpu usage drop and that includes all of the actual rendering, and code
work, so this would drop the eo overhead of these functions incredibly
low. using this much cpu just on doing call marshalling is a bug and
thus - this is a fix, but ... with an abi break to boot. more abi
breaks may happen before release to try and get them all in this
release so we don't have to do them again later.
note i actually tested 4, 3, 2, and 1 cache slots, and 1 was the
fastest. 2 was very close behind and then it got worse. all were
better than with no cache though.
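for illustration only, a one-slot cache of the general kind described above could look like the sketch below. all names (Demo_Class, Demo_Call_Cache, demo_resolve) are assumptions for the example and this is not the actual _eo_call_resolve code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; the real Eo class and cache structs differ. */
typedef int (*Demo_Func)(int);

typedef struct {
    int id;
    Demo_Func impl;
} Demo_Class;

typedef struct {
    const Demo_Class *klass; /* class the cached function belongs to */
    Demo_Func func;          /* resolved function for that class */
} Demo_Call_Cache;

static int demo_slow_lookups; /* how often the slow path was taken */

/* Stand-in for the expensive lookup the cache is meant to avoid. */
static Demo_Func demo_resolve_slow(const Demo_Class *klass)
{
    demo_slow_lookups++;
    return klass->impl;
}

/* One cache slot: hit if the class matches the last call, else refill. */
static Demo_Func demo_resolve(Demo_Call_Cache *cache, const Demo_Class *klass)
{
    assert(cache != NULL && klass != NULL);
    if (cache->klass == klass)
        return cache->func;
    cache->klass = klass;
    cache->func = demo_resolve_slow(klass);
    return cache->func;
}

static int demo_double(int x) { return x * 2; }
```

with a single slot the hit test is one pointer compare, which may be part of why more slots did not help in the runs above.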
benchmark test method:
  export ELM_ENGINE=gl
  export ELM_TEST_AUTOBOUNCE=1
  while [ 1 ]; do sync; sync; sync; time elementary_test -to genlist; sleep 1; done
take the 2nd to the 8th results (7 runs) and total up system and user
time. compare this to the same without the cache. with the cache cpu
time used is 90.3% of the cpu time used without - thus a win. at least
in my tests.
@fix
so we do a bit of error handling like does a stack fail to allocate,
does setting the tls var fail, have the stack frames been nulled or
not allocated, etc. - these actually cost every call because they mean
some extra compare and branches, but more because they cause a lot of
extra code to be generated, thus polluting instruction cache with code
and cacheline fetches of code that we rarely take - if ever.
every if () and DBG, ERR etc. does cost something. in really hotpath
code like this, i think it's best we realize that these checks will
basically never be triggered, because if a stack fails to grow... we
likely already blew our REAL stack for the C/C++ side and that can't
allocate anymore and has already just crashed (no magic message there -
just segv). so in this case i think this checking is pointless and
just costs us rather than gets us anything.
This causes a significant speed up (around 10% here) and is definitely
worth it. The way it's done lets the compiler cache the value across
different eo_do calls, and across the parts of eo_do. Start and end.
This breaks ABI.
This may look like an insignificant change, but it doubles the speed of
this function, and since this function is called so often, it actually
improves my benchmarks by around 8%.
This breaks ABI in a harmless way, and it will give us the ability to
drastically improve Eo in the future without breaking ABI again, thus
allowing us to declare Eo stable for this release if we choose to.
My previous patch to this piece of code (37f84b7e96) caused a significant
performance regression. This is such a hot path, that even accessing the
strings when we don't have to slows things down drastically. It makes
more sense to just store it in the structure.
This commit breaks ABI (though most people probably won't even need to
recompile anything else because of the memory layout).
It was discussed on IRC and was decided this is a big enough issue to
warrant a fix during the freeze.
@fix
We use function names instead of function pointers on Windows, because
of dll import/export issues (more in a comment in eo.c). Before this
commit we were comparing the pointers to the strings instead of the
content in some of the places, which caused op desc lookup not to work.
This fixes that.
Thanks to vtorri for his assistance.
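A minimal sketch of a lookup that compares string content rather than string addresses. Demo_Op_Desc and demo_op_desc_find are hypothetical names for the example, not the actual code in eo.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical op description; the real Eo struct has more fields. */
typedef struct {
    const char *api_name;
    int op;
} Demo_Op_Desc;

/* Find a description by name. Two equal names may live at different
 * addresses (for example across DLL boundaries), so compare content
 * with strcmp() instead of comparing the pointers. */
static const Demo_Op_Desc *
demo_op_desc_find(const Demo_Op_Desc *descs, size_t count, const char *api_name)
{
    assert(descs != NULL && api_name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(descs[i].api_name, api_name) == 0)
            return &descs[i];
    }
    return NULL;
}
```

A pointer comparison (descs[i].api_name == api_name) would only match when both sides happen to share the same string storage.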
@fix
This removes code that became dead in commit:
389c6d35f2
The commit doesn't explain why we don't shrink or grow when using mmap,
but this is how it is. No reason to keep old code there.
CID 1240224
@fix
Commit 37f84b7e96 introduced a few changes
to the callback matching mechanism that made it so sometimes callbacks
would be triggered for the wrong events. The problem was there because
of the support for legacy events that forces us to do string comparison
instead of the usual pointer comparison. We should only do string
comparison when we are certain one of the callbacks is a legacy
generated one.
Regression tests will follow tomorrow. Way too late here for that.
Thanks to cedric for reporting.
As described by Carsten in his email to edev ML titled:
"[E-devel] eo stability - i think we need to postpone that"
with the switch to Eo2 we significantly increased our usage of RW memory
pages, and thus significantly increased our memory usage when running
multiple applications.
The problem was that during the migration to Eo2 the op id cache and the
op description arrays were merged, causing the op description arrays to
no longer be RO. This patch enables users of Eo (mainly Eolian) to
declare those arrays as const (RO) again, saving that memory.
There might be performance implications with this patch. I had to remove
the op desc array sorting, and I used a hash table for the lookup. I
think the op desc sorting doesn't really affect performance because that
array is seldom accessed and is usually pretty short. The hash table
is not a problem either, because it's behind the scenes, so it can be
changed to a more efficient data structure if the hash table is not good
enough. The hash table itself is also rarely accessed, so it's mostly
about memory.
Please keep an eye out for any bugs, performance regressions or excessive memory usage.
I believe this should be better on all fronts.
This commit *BREAKS ABI*.
@fix
The old naming is inconsistent with the rest of the EFL. This fixes that.
Since we are already breaking ABI (and possibly API), we should fix this too.
This hasn't been used for a while. Since we are going to break Eo a bit anyway
it's a good opportunity to drop this.
This may cause slight performance issues with legacy events, such as
smart callbacks. This shouldn't really be a problem as we've migrated away from
them. If it does, we need to migrate the remaining parts. Only relevant
for callbacks that are added before the classes are created, which
shouldn't be possible except for smart, only for old evas callbacks.
Scenario:
- Same signal/function/data registered twice on e.g. mouse_down
- On mouse_down, register mouse_move and mouse_up
- On mouse_up, unregister mouse_move
Result: mouse_move still invoked after mouse_up
Reason:
- When the mouse_move callback deletion is required, the cb is
  flagged as deleted but is not freed because walking_list blocks it.
- When the second (and same) has to be deleted, it will try to delete
  the first again because the delete_me flag is not checked.
This patch fixes it by checking the delete_me flag when determining the
candidate.
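The candidate selection described above can be sketched as follows. This is a simplified model with hypothetical names (Demo_Cb, demo_cb_del), not the actual patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback list node; not the real structure. */
typedef struct Demo_Cb {
    struct Demo_Cb *next;
    void (*func)(void *data);
    void *data;
    int delete_me;  /* flagged for deletion, freed later when safe */
} Demo_Cb;

/* Flag the first matching callback that is not already flagged.
 * Returns 1 if a candidate was found, 0 otherwise. */
static int demo_cb_del(Demo_Cb *list, void (*func)(void *data), void *data)
{
    assert(func != NULL);
    for (Demo_Cb *cb = list; cb; cb = cb->next) {
        if (cb->delete_me)
            continue;   /* already deleted: not a candidate */
        if (cb->func == func && cb->data == data) {
            cb->delete_me = 1;
            return 1;
        }
    }
    return 0;
}

/* Example callback used to build a list with duplicate entries. */
static void demo_on_move(void *data) { (void)data; }
```

Without the delete_me test, the second deletion request would match the first (already flagged) entry again and leave the duplicate active.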
@fix
This should not happen. Objects with parents must have their parents
unset before they reach refcount == 0. That's because the parent is the
one holding the refcount. This means that if we get to the destructor
(object is deleted) while a parent is still set, we have an error
scenario.
After this change, parent_set assigns a ref, so for example:
obj = eo_add(CLASS, parent); /* Ref is 1 */
eo_do(obj, eo_parent_set(parent2)); /* Ref is 1 */
eo_ref(obj); /* Ref is 2 */
eo_do(obj, eo_parent_set(NULL)); /* Ref is 1, giving the ref to NULL */
eo_do(obj, eo_parent_set(parent)); /* Ref is 1 */
This is following a discussion on the ML about commit
8689d54471.
@feature
optimization
xrefs keep lists of object references. children are already in a list.
why keep both? lots of extra memory used for no value when debug is on
(pretty much most of the time).