Summary:
When harfbuzz is enabled, _content_create_ot() function will be used
for shaping. If evas_common_font_int_cache_glyph_get() failed in some reason,
it never proceed gl_itr until the end.
It can cause weird rendering result. Because, all of gl_itr after the failure
can't have proper x_bear, y_bear and width.
@fix
Test Plan: N/A
Reviewers: raster, cedric, herdsman, jpeg
Differential Revision: https://phab.enlightenment.org/D5154
Summary:
Assigning a result of integral division to a double type variable is
not useful for next division calculation. For more accurate calculation,
it needs to be casted to double before doing division.
It does not fix some bugs. It was reported by a code quality advisor.
Test Plan: N/A
Reviewers: raster, cedric, jpeg, herdsman, eunue
Reviewed By: cedric
Differential Revision: https://phab.enlightenment.org/D5069
Signed-off-by: Cedric Bail <cedric@osg.samsung.com>
This reverts commit 9368eedd35.
The release is out so we can revert this bandaid again. In the hope to
find the real culprit and solution before the next release.
Currently, elementary programs crash on termination on macOS (seems
Sierra-specific). This is very nasty, looks like deep memory corruption...
Without valgrind (or like) support on Sierra, it is difficult to
pinpoint the origin of the problem.
Due to the imminient release, and after discussion with @stefan, this
kludge will allow the release to happen.
This commit MUST be reverted just after the release, so we don't
blindfold ourselves!
Ref T5245
Summary:
There are many requests to add a new feature for handling horizontal align
according to current locale. For example, in RTL locale setting,
users want to see right aligned text for every list's item.
Even if some of list's items only contain LTR characters!
It is useful for the needs.
@feature
Test Plan: N/A
Reviewers: herdsman, tasn, woohyun, raster, cedric
Reviewed By: herdsman, raster
Subscribers: z-wony, jpeg
Differential Revision: https://phab.enlightenment.org/D4664
This is needed by dmabuf engine fallback when it realizes it locally
allocated a buffer, has been rendering to it, but the compositor can't use
it.
So the engine copies its buffer contents into a new wl_shm buffer and
continues from there - however we need to make sure the async renderer
has finished first, so we don't copy a partial buffer.
This allows us to block for all previously submit actions in the render
queue to complete.
Summary:
Rounding the sum of glyph's advance could cause inconsistency of
each glyph's positions. When Evas enables Harfbuzz library,
Each glyph's position has to be handled by only nearby glyphs.
But, currently, totally unrelated glyph's advacne could change
other glyphs positions.
ex) 1. "connect."
2. "Tap here to connect."
You can see different gap between "c" and "o" of word "connect".
It should be same even if there was a different text before the word "connect".
@fix
Test Plan: N/A
Reviewers: raster, herdsman, jpeg
Reviewed By: raster
Subscribers: cedric
Differential Revision: https://phab.enlightenment.org/D4782
This is an attempt at refactoring the filters code so I can
later implement GL support. This patch adds a few extra changes
to remove avoid calling functions of libevas from the software
engine: use the draw functions from static_libs/draw rather
than evas_common APIs.
The only purpose of this commit was to allow efl 1.19 to be
released on macOS wothout crashing on termination. Time to revert
it and see that we can find a real fix for the next release.
This reverts commit cd5e755951.
ref T5245
There are reports of crashes when y < 0. This case seems
abnormal in case of filters, as I don't know how to reproduce it,
but it's happened.
Thanks Youngbok Shin for the report.
@fix
Summary:
If the last item before ellipsis item has bigger width than its advance,
evas_common_font_query_last_up_to_pos() function can find wrong ellipsis position.
When Evas finds a position for non last item, Evas must care about additionally
available space for glyph's width of the given x position.
ex) the last item's glyph before ellipsis item has a tail to draw above the ellipsis item.
@fix
Test Plan:
Test case will added as comment.
(Becasue of font license problem.)
Reviewers: herdsman, raster, jpeg, woohyun
Subscribers: cedric, Blackmole
Differential Revision: https://phab.enlightenment.org/D4727
Currently, elementary programs crash on termination on macOS (seems
Sierra-specific). This is very nasty, looks like deep memory corruption...
Without valgrind (or like) support on Sierra, it is difficult to
pinpoint the origin of the problem.
Due to the imminient release, and after discussion with @stefan, this
kludge will allow the release to happen.
This commit MUST be reverted just after the release, so we don't
blindfold ourselves!
Ref T5245
If GL context is free'd before processing font shutdown,
textures for emoji glyph's GL images will be free'd without clean
up its GL images. It causes eina mempool infinite loop issue when
emoji's GL images are free'd in shutdown process.
So, the patch will make a list for emoji's GL images in context and
clean up them when the context is free'd. Just like font textures in
context.
@fix
Differential Revision: https://phab.enlightenment.org/D4695
Signed-off-by: Jean-Philippe Andre <jp.andre@samsung.com>
Summary:
SETUP_LINE_SHALLOW and SETUP_LINE_STEEP are each identically defined
(except whitespace) in evas_line_main.c
Reviewers: cedric, jpeg
Subscribers: jpeg, cedric
Differential Revision: https://phab.enlightenment.org/D4681
it's a warning one way or another so reduce noise with a harmless case
as passing in a pit ro a 32bit type is more restrictive than the ptr
it accepts (an 8bit type)
so a little perf fun shows malloc/free/realloc/etc. are, combined a
reasonable overhead. this reduced malloc overhead for draw contexts so
whne we duplicate them or create new ones, we re-use a small cache of
8 of them to avoid re-allocation. just take the first one from the
list as it really is that simple. mempool would not have helped more
here and cost more overhead.
@optimize
In gl engine, image objects try to unload image's pixel data after creating or updating the texture.
but image entry's reference is still 1, it is added to the pending_unloads list,
and it is cleaned when evas render function.
If elm image use preload feature, preload_done flag is true, so this image data cannot be removed from
pending_unloads list, it cause memory leak.
I think it is better to free image's pixel data in evas_cache_image_unload_data,
(not add to the pending_unloads list)
but it it complicated to modify.
so I'll remove the code to check preload_done flag in evas_common_rgba_pending_unloads_cleanup function.
this flag check was added because of gl preloading, but now gl preloading feature is disabled.
this flag is related with https://phab.enlightenment.org/D2823
I tested photocam, but crash doesn't occur anymore, even though removing flag check.
to date if you use async preload we still load the header
synchronously and this can be horrible especially with generic
loaders. there is no way to farm this off to the preload thread. now
there is. youhave to set it as a skip head load option before doing a
file_set AND you need to issue a preload ... but now it's possible.
@feature
i found evas_common_draw_context_apply_cutouts() was procsessing 300+
cutouts and as it's O(n^2)/2 to try and merge adjacent rects for
cutouts this really performs like complete junk. we apply cutout rects
a LOT. this is not the best solution, but it's quick and much faster
than doing the clipouts which drop framerate to like 1-2fps or so in the
nasty case i say (tyls -m of photos in a dir with a 2160 high
terminal).
this figures out the target area to limit the count of rects
significantly so O(n^2) is far far better when n is now < 10 most of
the time. and for the few operations where it's a high value this now
uses qsort to speed up merges etc. etc.
@optimize
this doenst change functionality but just cleans up the file
whitespacing and formatting and removed commented out junk, 80 column
wrapping/overflow etc.
We have an issue that eina_thread_queue msg isn't delivered properly on win32.
That occurs broken image drawing in case of non-smooth scaling.
I disabled this feature on win32 because scale_sample_draw is gonna be rarely used
since async rendering introduced.
elm_suite would crash when CK_FORK=no is set, because evas was
badly initializing or shutting down. Note that elm_suite still
crashes with CK_FORK=no but valgrind doesn't complain.
Summary:
In case of thread creation failure, shutdown logic will be stuck.
To prevent stuck, set exit variables to make thread_shutdown working
even if init fails.
Also modify init logics to return init result to a caller.
Reviewers: jypark, woohyun, cedric, jpeg
Subscribers: cedric
Differential Revision: https://phab.enlightenment.org/D4411
Note (@jpeg):
I have modified the patch just a little bit.
Signed-off-by: Jean-Philippe Andre <jp.andre@samsung.com>
Compiling on rpi3 indicated that there are some unused variables in
the neon codepaths for several evas op functions. This patch just adds
EINA_UNUSED to the function parameters where needed.
NB: No functional changes
Signed-off-by: Chris Michael <cp.michael@samsung.com>
Compiling on rpi3 using neon indicates that 'alpha' and 'tmp'
variables are unused. Reading through the source confirms it, so
remove them.
Signed-off-by: Chris Michael <cp.michael@samsung.com>
Moved rects caching into draw context to avoid the use of __thread
slot. Draw context are defined per thread anyway and should be just
fine. This doesn't really change the picture regarding glibc problem
when to many __thread are needed, but slightly improve the global
picture. Also this patch doesn't affect our performance in expedite
benchmark as far as I can tell.
so bu5hman pointed out a compile warning from clang that
{ 0x20000, 42711, EVAS_SCRIPT_HAN },
has 42711 exceeding a signed short. true. so this should be an
unsigned short. but this drew me to the fact the whole array could be
shorter by packing this short with the style memeber after it making
them pack into a nicely aligned 4 byte chunk next to the start unicode
value before it, thus chopping 1324 bytes off this table. even worse
the 8192 entry fast table above is using a full 32bits per entry where
they data they store is not even exceeding 7bits, so move this to an
unsigned char saving another 24k. this should reduce cache misses and
memory footprint and binary footprint of the evas .so files etc.
@fix + @optimize
Ref T4623
v40 bytecode interpreter is official as of freetype 2.7.
The results don't look so good at the moment. The text looks and glyph
positioning seem worse than they were with the previous v35 interpreter.
So, in the meantime we'll keep using v35, just so everything looks
normal again.
Although the v40 is relevant since around 2.6.3, I rather not do any
FREETYPE_MINOR checks in this patch, because distributions might ship
previous versions with the other (v38) interpreter enabled.
We've been pinning the render thread for every EFL process to core 0.
This is a bit silly in the first place, but some big.LITTLE arm systems,
such as exynos 5422, have the LITTLE cores first.
On those systems we put all the render threads on a slow core.
This attempts to fix that by using a random core from the pool of fast
cores.
If we can't determine which cores are fast (ie: we're not on a
linux kernel with cpufreq enabled) then we'll continue doing what we've
always done - pin to core 0.
We've been pinning the render thread for every EFL process to core 0.
This is a bit silly in the first place, but some big.LITTLE arm systems,
such as exynos 5422, have the LITTLE cores first.
On those systems we put all the render threads on a slow core.
This attempts to fix that by using a random core from the pool of fast
cores.
If we can't determine which cores are fast (ie: we're not on a
linux kernel with cpufreq enabled) then we'll continue doing what we've
always done.
I got an issue report about map rendering.
After investigated, I found that was introduced by data overflow.
For fast computation, evas map uses integer data type rather than float,
that gives up some range of data size.
So, if vertex range is a little large but still reasonable,
polygon won'be properly displayed due to the integer overflow.
We can fix this by changing FPc data type to 64 bits (ie, long long)
But I didn't do yet though I can simply fix this costlessly.
By the way, my test case map points are below.
0: -1715, -5499
1: -83, -1011
2: 1957, 5721
3: 325, 1233
and gl result is perfect but sw is totally broken.
@fix
This brings support for the eo api for external buffers (like
the old data_set / data_get). The new API now works with slices
and planes.
The internal code still relies on the old cs.data array for
YUV color conversion. This makes the code a little bit too
complex to my taste.
Tested with expedite for RGBA and YUV 422 601 planar, both
SW and GL engines (x11).
It's entirely possible that a system doesn't have a cpu 0, so
when we try to pin all our render threads onto processor 0 we
may fail.
This results in some very connfusing build breakage when
edje_cc hangs up because its render thread didn't start.
So, if starting the thread with affinity fails, let's try without
affinity.
(This is trivial to reproduce - just use sysfs to turn off cpu0
after boot.)
@fix
this speeds up downscaling of images by somewhere between 1.8 to 3x
dpeending on case and cpu etc. - this is ONLY for downscaling of an
image buffer betweeb 50% width and/or height up to 100% of width and
height. it's a special case optimization that cuts down the complexity
of the full super sampling filter to just do a bilinear interpolation
which is actually strictly correct for this size range and shouldn't
drop quality. it uses fixed point (16.16) to do the sup pixel sampling.
no mmx/asse or neon, but we could actually easily use it as we do use
mmx/ee and neon in the bilinear upscaler to do interpolation so this
would work here too. it just requires time and effort to make yet 2x
more special cases and use the ASM to do the hard slog here.
@optimize
This is essentially a cleaner redo of ef817f15f0.
Logic should be exactly the same as there, the different is that this
one shares the code between OT and non OT.
Please refer to that commit for more information.
This was not done correctly. This split the code, which is essentially
the same for both OT and non OT. It's the same logic with some minor
additions for OT, so most of the code should be together.
This reverts commit ef817f15f0.
Fixes T4068.
Simply querying the last glyph to determine the width of the glyph sequence
won't always work, as OT can have negative offsets (adjusts the placement of a
specific glyph better).
The solution is to calculate the "max width" of some sequence that will
guarantee us proper width results. The worst solution would be to iterate on all
the glyphs and sum up the max width. This is a bit impractical. Instead, we will
inspect just the "cluster" of the last glyph, if one exists.
This should have no performance impact on trivial cases, and very little impact
on the others.
@fix
some font glyphs are still allocated after tyhe last gl window is
freed which means we can't make current anymore to free textures after
that. this fixes that by flushing gl texture info from the font cache
when the last gl windows are gone.
@fix
We were copying a user defined string into a fixed size buffer
without doing any boundary checks. This commit fixes that.
Also cleaned up similar code that was using hardcoded numbers.
@fix.
Summary:
evas_common_language_from_locale_* functions kept static pointers
inside of its functions. Once these function was called, it was never reset.
It made big problems for harfbuzz and hyphenation. Also, Elementary
provides elm_language_set() API. Then we need to support it fully.
@fix
Test Plan: Test case for hyphenation is included in Evas test suite.
Reviewers: raster, tasn, herdsman, woohyun, z-wony, Blackmole, minudf
Subscribers: cedric, jpeg
Differential Revision: https://phab.enlightenment.org/D3864
this works with 7166e6b859 and fixes a
leak added because ... free does not free!
evas_common_draw_context_cutouts_real_free(0 now actually frees the
rects, but evas_common_draw_context_cutouts_free() before did not.
@fix (follow on from 7166e6b859)
this fixes the fix 4d6a8a7fce to be
portable to platfomrs that do not support __thread - seemingly openbsd
does not (argh!) and maybe others. so on these platforms then they
dont get the optimization of keeping a cutout rect pool to avoid
re-allocation.
this also every 4096 draws "resets" the cutout cache so it doesnt
expand and stay expanded forever.
@fix
several draw funcs keep a static Cutout_Rect *rects = NULL; variable
to cache cutout rects to avoid re-allocating them a lot etc. this is
fast and handy but we may use these from multiple threads. thats bad
.... mmmkay. so this fixes it the dirty way - makes them thread local.
:)
this fixes T3348 - the crash mentioned by @zmike
@fix
Width calculations should consider the x_bear. This has been leading to
inconsistent results between wrapping calculation during layout and the
final formatted size.
Also, we should stop our walk only when exceeding 'x', so changed "<="
to "<".
@fix
Summary:
Previous implementation loaded data from memory first and then checked the borders.
Here I check the borders first as it is for C implementation.
This prevents read of non-accessible memory.
Reviewers: cedric, jypark, Hermet, jiin.moon, jpeg
Reviewed By: jpeg
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D3809
Those APIs should provide a cleaner interface than the
old data_set/data_get APIs, by making sure the operations are
atomic (ie. no need to call size_set, cspace_set and then data_set).
padding/duplicated borders are not supported.
TODO: Implement legacy API on top of the new API, instead of
this quick patch
Summary:
Evas Text, Textblock, Textgrid keeps own language information.
This language information could be vary from the result of setlocale().
Especially, Evas Textblock supports <lang> tag. The language could be
changed in the middle of text. All of these language has to be used
for harfbuzz shaping.
@fix
Test Plan: N/A
Reviewers: herdsman, raster, woohyun, tasn
Reviewed By: tasn
Subscribers: cedric, jpeg
Differential Revision: https://phab.enlightenment.org/D3628
See T2865.
Since Harfbuzz 1.1.0, terminology displays fonts funnily aligned to
the top. This is apparently because until 1.0.6 the y_offset was
always 0 for all glyphs, but since 1.1.1 the offset is actually
set.
This is a TEMPORARY fix. There might be an underlying issue left
here.
Harfbuzz changed behaviour in this commit:
commit 44f82750807475aa5b16099ccccd917d488df703
Author: Behdad Esfahbod <behdad@behdad.org>
Date: Wed Nov 4 20:40:05 2015 -0800
[ft] Remove font funcs that do nothing
Summary:
It adds evas_object_paragraph_direction_set, get APIs.
The APIs set or get paragraph direction to/from the given object.
It changes BiDi calculations and affect the direction and aligning of text.
It doesn't have any effect to text without Fribidi library.
The default paragraph direction is EVAS_BIDI_DIRECTION_INHERIT.
If dir is EVAS_BIDI_DIRECTION_INHERIT, paragraph direction is changed
according to smart parent object. If there is no smart parent object,
paragraph direction works as EVAS_BIDI_DIRECTION_NEUTRAL.
@feature
Test Plan:
Test cases included to the following files.
- evas_test_textblock.c
- evas_test_text.c
- evas_test_object_smart.c
Run "make check".
Reviewers: woohyun, raster, herdsman, tasn
Subscribers: c, raster, cedric
Differential Revision: https://phab.enlightenment.org/D1690
This function was trying to infer from the LANG env var, though it should
have just queried the locale all along, as the language we want is the
system's text language, and not necessarily the LANG variable's value.
@fix.
Summary:
Move model save/load to common3d.
Here also will be common algorithms and structures which will be used in all loaders and savers.
See task https://phab.enlightenment.org/T2713.
Reviewers: cedric, Hermet, raster, Oleksander
Subscribers: cedric
Differential Revision: https://phab.enlightenment.org/D3030
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Typos, lack of NULL check, excessive sizeof(type) not matching
the object type, no border set, etc... This all lead to a crash
and then no render (with an error message and then without...).
This also simplifies the implicit loading of ETC1 as ETC2 when
supported by the driver.
@fix
Actually copying max is pretty useless and super slow. We usually have something
like 1024 slot in a context, but a very small amount of them are acutally active.
It would be better to actually do some kind of copy on write technique here, but
as Eina_Cow doesn't handle array and we are close to a release, let's be
conservative.
the overhead didnt show up in y tests. do show up with certain
expedite tests. hmmm. last time i messed with region code it was
actually same speed as tiler. bonus was it was fully accurate.
valgrind pointed this one out. we access freed memory when we dup a
context because the context CONTAINS ptrs to things like rects for
cutouts. we didnt dup these. use the proper context dup call (and
properly ref pixman color image too). this was a random bug/crash
waiting to happen and valgrind caught it. suprising it hasnt turned up
before :/
@fix
this optimizes draw ctxt cutouts by skipping small ones and
remembering the last cutout added so it isn't double-added as well as
extending the minimum cutout array to 512 and going up in blocks of
512 instead of 128. also optimize the clipping code a bit more.
this move evas tiler that does update handling to use fully correct
regions using region.[xh]. this also removed old unused regionbuf code
and a bunch of commented out code no longer needed. much simpler now
and easier to maintain.
For script runs that start with an UNKNOWN character, the whole
run was mistakenly identified as script type UNKNOWN.
Also, refactored code a bit for readability.
Fixes T2670.
@fix
i am not sure if this is the odd crash i am seeing, but in theory it
could be. as these crashes are rare it's hard to find and gdb is "too
late" other than telling me the image is freed already by the time we
do an unload.
Summary:
Now Evas gl preload feature is disabled.
But if it is turned on, memory crash occurs.
Because evas_gl_common_texture_upload is not excuted immediately.
Test Plan: EVAS_GL_PRELOAD=1 ELM_ENGINE=gl elementary_test -to "photocam"
Reviewers: raster, cedric, woohyun, seoz, Hermet, singh.amitesh, jpeg
Subscribers: jpeg, cedric
Differential Revision: https://phab.enlightenment.org/D2823
Signed-off-by: Jean-Philippe Andre <jp.andre@samsung.com>
i was runing perf top and noticed that evas_image_load_file_data_eet(0
was being called. in fact - it was #1 on the list of functions being
called. why? it didn't make sense. i found out. just a blinking cursor
in terminology was causing the background to be unloaded and
re-loaded. the new "actually unload" changes for 1.15 made this happen
and thus we kept sucking in new data all the time even if the
scalecache already had the data - and that was the problem.
so now calcecache prepare tells you if you don't have cached data and
if you likely then have to ensure the data is loaded. this cuts down
quite a bit of work.
while i'm at it... we definitely need to clean house on the internals
of evas. a decade+ of features, mess, optimizations needs to be fixed.
i mean really house-cleaned. rewritten clenl;y re-using existing code
where appropriate.
i think this has been disabled for a while. image unloading is broken
- esp with gl enigne as due to async move it was effectively disabled.
this re-enables it. unloading is deferred with a managed list of things
needing unloading and then when any async sw renders are not busy any
more - do the unload then in the mainloop of all pending/flagged
images to unload
@fix
Otherwise there would be conflicts in certain circumstances.
This also requires adding const on many existing functions,
and similar work is necessary in Elementary.
@fix
if yuou use 709 instead of 601 yuv (ycbcr) evas will just be wrong and
use 601. this fixes that and implements 709. it also fixes a scaling
bug for yuv in the gl engine. no one noticed but me, so i won't call
this a bug fix, and it can go into the next efl release - no need to
backport unless it actually bothers peolpe (which it seemingly doesn't)
Summary: This fixes build for aarch64 when TILE_ROTATE is disabled and BUILD_NEON is enabled(it is enabled by default for aarch64 since https://phab.enlightenment.org/D2309).
Reviewers: cedric, raster
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2498
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
In GCC 5.1 arm_neon header for aarch64 was changed. It is not possible anymore to silently cast uint64x1_t to int.
So replace cast with proper getter function to avoid following error:
lib/evas/common/evas_convert_color.c:50:18: error: incompatible types when assigning to type 'DATA32 {aka unsigned int}' from type 'uint64x1_t'
nas += vpaddl_u32(vpaddl_u16(vpaddl_u8(cmp)));
Reviewers: raster, cedric, devilhorns
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2443
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary: NEON intrinsics can be built both for armv7 and armv8.
Reviewers: raster, cedric
Reviewed By: cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2442
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary: NEON intrinsics can be built both for armv7 and armv8.
Reviewers: raster, cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2441
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary: NEON intrinsics can be built both for armv7 and armv8.
Reviewers: raster, cedric
Reviewed By: cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2440
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
NEON intrinsics can be built both for armv7 and armv8.
There were no NEON variant for this function, so it was added with all copies to init function.
Reviewers: raster, cedric
Reviewed By: cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2417
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
Use vceqq and vbsl instead of twice as much vmovl and vadd instructions.
Replace vaddq_u8 with vaddq_u32.
This allows NEON code to behave exactly like C version.
Reviewers: raster, cedric
Reviewed By: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2361
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
Use vceqq and vbsl instead of twice as much vmovl and vadd instructions.
Replace vaddq_u8 with vaddq_u32.
This allows NEON code to behave exactly like C version.
Reviewers: cedric, raster
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2362
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
Add new define, BUILD_NEON_INTRINSICS to control whether NEON inline code or
NEON intrinsics should be built.
GCC NEON intrinsics can be built both for armv7 and armv8. However NEON inline
code can be built only for armv7.
@feature
Reviewers: raster, stefan_schmidt, cedric
Subscribers: cedric, stefan_schmidt
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2309
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
When processing random data result of this function differs from C variant in more than 50% cases.
This difference is due to alpha calculation, in C code :
alpha = 256 - (*s >> 24)
in NEON:
"vmvn.u8 q4,q0 \n\t"
// ie ~(*s>>24) === 255 - (*s>>24)
We cant just add "1" as overflow will occur in case (*s>>24) == 0 (we use only 8 bit per channel in vector registers)
So here is the solution:
copy *d right before multiplication and add it to the result of it later.
Same approach as in D455.
Reviewers: raster, cedric, stefan_schmidt
Reviewed By: cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2308
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Make sure not to sample the mask image outside its boundaries.
This is a series of last resort checks. I can not reproduce the
crashes but know they have happened.
I used EINA_UNLIKELY more for clarity than for compiler optimizations.