Actually copying max is pretty useless and super slow. We usually have something
like 1024 slots in a context, but only a very small number of them are actually active.
It would be better to actually do some kind of copy on write technique here, but
as Eina_Cow doesn't handle array and we are close to a release, let's be
conservative.
the overhead didn't show up in my tests. it does show up with certain
expedite tests. hmmm. last time i messed with region code it was
actually the same speed as tiler. bonus was it was fully accurate.
valgrind pointed this one out. we access freed memory when we dup a
context because the context CONTAINS ptrs to things like rects for
cutouts. we didn't dup these. use the proper context dup call (and
properly ref pixman color image too). this was a random bug/crash
waiting to happen and valgrind caught it. surprising it hasn't turned up
before :/
@fix
this optimizes draw ctxt cutouts by skipping small ones and
remembering the last cutout added so it isn't double-added as well as
extending the minimum cutout array to 512 and going up in blocks of
512 instead of 128. also optimize the clipping code a bit more.
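A rough sketch of the cutout bookkeeping described above, not the actual Evas code. The `Cutouts` struct, `cutouts_add()` and the `CUTOUT_MIN_SIZE` threshold are hypothetical names and values chosen for illustration; only the 512-entry block size comes from the message.

```c
#include <stdlib.h>

/* Hypothetical types; these are not the Evas draw context structures. */
typedef struct { int x, y, w, h; } Rect;

typedef struct {
    Rect *rects;
    int   count;     /* rects in use */
    int   capacity;  /* rects allocated */
    int   has_last;  /* non-zero once 'last' is valid */
    Rect  last;      /* most recently added cutout */
} Cutouts;

#define CUTOUT_BLOCK    512  /* growth step named in the commit message */
#define CUTOUT_MIN_SIZE 8    /* assumed threshold for a "small" cutout */

/* Returns 1 if the cutout was added, 0 if skipped, -1 on allocation failure. */
static int cutouts_add(Cutouts *c, int x, int y, int w, int h)
{
    /* Skip cutouts too small to be worth the clipping work. */
    if (w < CUTOUT_MIN_SIZE || h < CUTOUT_MIN_SIZE) return 0;

    /* Skip an exact repeat of the last cutout added. */
    if (c->has_last && c->last.x == x && c->last.y == y &&
        c->last.w == w && c->last.h == h)
        return 0;

    /* Grow the array one block at a time. */
    if (c->count == c->capacity) {
        size_t n = (size_t)c->capacity + CUTOUT_BLOCK;
        Rect *r = realloc(c->rects, n * sizeof(Rect));
        if (!r) return -1;
        c->rects = r;
        c->capacity += CUTOUT_BLOCK;
    }

    c->last = (Rect){ x, y, w, h };
    c->has_last = 1;
    c->rects[c->count++] = c->last;
    return 1;
}
```

Starting from a zero-initialised `Cutouts`, the first successful add allocates one block of 512 entries, and an immediate repeat of the same rectangle is dropped.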
this moves evas tiler that does update handling to use fully correct
regions using region.[ch]. this also removed old unused regionbuf code
and a bunch of commented out code no longer needed. much simpler now
and easier to maintain.
For script runs that start with an UNKNOWN character, the whole
run was mistakenly identified as script type UNKNOWN.
Also, refactored code a bit for readability.
Fixes T2670.
@fix
i am not sure if this is the odd crash i am seeing, but in theory it
could be. as these crashes are rare it's hard to find and gdb is "too
late" other than telling me the image is freed already by the time we
do an unload.
Summary:
The Evas GL preload feature is currently disabled.
If it is turned on, a memory crash occurs
because evas_gl_common_texture_upload is not executed immediately.
Test Plan: EVAS_GL_PRELOAD=1 ELM_ENGINE=gl elementary_test -to "photocam"
Reviewers: raster, cedric, woohyun, seoz, Hermet, singh.amitesh, jpeg
Subscribers: jpeg, cedric
Differential Revision: https://phab.enlightenment.org/D2823
Signed-off-by: Jean-Philippe Andre <jp.andre@samsung.com>
i was running perf top and noticed that evas_image_load_file_data_eet()
was being called. in fact - it was #1 on the list of functions being
called. why? it didn't make sense. i found out. just a blinking cursor
in terminology was causing the background to be unloaded and
re-loaded. the new "actually unload" changes for 1.15 made this happen
and thus we kept sucking in new data all the time even if the
scalecache already had the data - and that was the problem.
so now scalecache prepare tells you if you don't have cached data and
if you likely then have to ensure the data is loaded. this cuts down
quite a bit of work.
while i'm at it... we definitely need to clean house on the internals
of evas. a decade+ of features, mess, optimizations needs to be fixed.
i mean really house-cleaned. rewritten cleanly re-using existing code
where appropriate.
i think this has been disabled for a while. image unloading is broken
- esp with gl engine as due to async move it was effectively disabled.
this re-enables it. unloading is deferred with a managed list of things
needing unloading and then when any async sw renders are not busy any
more - do the unload then in the mainloop of all pending/flagged
images to unload
@fix
Otherwise there would be conflicts in certain circumstances.
This also requires adding const on many existing functions,
and similar work is necessary in Elementary.
@fix
if you use 709 instead of 601 yuv (ycbcr) evas will just be wrong and
use 601. this fixes that and implements 709. it also fixes a scaling
bug for yuv in the gl engine. no one noticed but me, so i won't call
this a bug fix, and it can go into the next efl release - no need to
backport unless it actually bothers people (which it seemingly doesn't)
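For reference, a small sketch of how the two matrices differ. It uses the commonly quoted limited-range ("video range") coefficients for BT.601 and BT.709; whether evas uses exactly these constants or this range is an assumption, and `yuv_to_rgb()` and `YuvMatrix` are made-up names for illustration.

```c
#include <stdint.h>

typedef enum { YUV_BT601, YUV_BT709 } YuvMatrix;

/* Round to nearest and clamp to the 8-bit range. */
static uint8_t clamp_u8(double v)
{
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

/* Convert one limited-range YCbCr sample to 8-bit RGB (rgb[0]=R, [1]=G, [2]=B). */
static void yuv_to_rgb(YuvMatrix m, uint8_t y, uint8_t cb, uint8_t cr,
                       uint8_t rgb[3])
{
    double yy = 1.164 * (y - 16);
    double u = cb - 128;
    double v = cr - 128;

    if (m == YUV_BT709) {
        rgb[0] = clamp_u8(yy + 1.793 * v);
        rgb[1] = clamp_u8(yy - 0.213 * u - 0.533 * v);
        rgb[2] = clamp_u8(yy + 2.112 * u);
    } else {
        rgb[0] = clamp_u8(yy + 1.596 * v);
        rgb[1] = clamp_u8(yy - 0.392 * u - 0.813 * v);
        rgb[2] = clamp_u8(yy + 2.017 * u);
    }
}
```

Greys come out the same under both matrices (Cb = Cr = 128), so decoding 709 content as 601 only shows up as a colour shift on saturated pixels, which may be why it went unnoticed.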
Summary: This fixes the build for aarch64 when TILE_ROTATE is disabled and BUILD_NEON is enabled (it is enabled by default for aarch64 since https://phab.enlightenment.org/D2309).
Reviewers: cedric, raster
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2498
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
In GCC 5.1 the arm_neon header for aarch64 was changed. It is no longer possible to silently cast uint64x1_t to int.
So replace the cast with the proper getter function to avoid the following error:
lib/evas/common/evas_convert_color.c:50:18: error: incompatible types when assigning to type 'DATA32 {aka unsigned int}' from type 'uint64x1_t'
nas += vpaddl_u32(vpaddl_u16(vpaddl_u8(cmp)));
Reviewers: raster, cedric, devilhorns
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2443
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary: NEON intrinsics can be built both for armv7 and armv8.
Reviewers: raster, cedric
Reviewed By: cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2442
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary: NEON intrinsics can be built both for armv7 and armv8.
Reviewers: raster, cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2441
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary: NEON intrinsics can be built both for armv7 and armv8.
Reviewers: raster, cedric
Reviewed By: cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2440
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
NEON intrinsics can be built both for armv7 and armv8.
There was no NEON variant for this function, so it was added with all copies to the init function.
Reviewers: raster, cedric
Reviewed By: cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2417
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
Use vceqq and vbsl instead of twice as many vmovl and vadd instructions.
Replace vaddq_u8 with vaddq_u32.
This allows the NEON code to behave exactly like the C version.
Reviewers: raster, cedric
Reviewed By: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2361
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
Use vceqq and vbsl instead of twice as many vmovl and vadd instructions.
Replace vaddq_u8 with vaddq_u32.
This allows the NEON code to behave exactly like the C version.
Reviewers: cedric, raster
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2362
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
Add new define, BUILD_NEON_INTRINSICS to control whether NEON inline code or
NEON intrinsics should be built.
GCC NEON intrinsics can be built both for armv7 and armv8. However NEON inline
code can be built only for armv7.
@feature
Reviewers: raster, stefan_schmidt, cedric
Subscribers: cedric, stefan_schmidt
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2309
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Summary:
When processing random data, the result of this function differs from the C variant in more than 50% of cases.
This difference is due to the alpha calculation. In C code:
alpha = 256 - (*s >> 24)
in NEON:
"vmvn.u8 q4,q0 \n\t"
// ie ~(*s>>24) === 255 - (*s>>24)
We can't just add "1" as overflow will occur in case (*s>>24) == 0 (we use only 8 bits per channel in vector registers)
So here is the solution:
copy *d right before multiplication and add it to the result of it later.
Same approach as in D455.
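The identity behind this is d * (256 - a) == d * (255 - a) + d. A scalar C model of one 8-bit channel, as a sketch only: the function names are made up for illustration and this is not the NEON code itself.

```c
#include <stdint.h>

/* C reference: scale one 8-bit channel by (256 - a) / 256. */
static uint8_t scale_c(uint8_t d, uint8_t a)
{
    return (uint8_t)((d * (256 - a)) >> 8);
}

/* What an 8-bit lane gives with a plain bitwise NOT: ~a == 255 - a. */
static uint8_t scale_inverted_only(uint8_t d, uint8_t a)
{
    uint8_t inv = (uint8_t)~a;
    return (uint8_t)((d * inv) >> 8);
}

/* The fix: keep a copy of d and add it to the widened product.
 * d * (255 - a) + d == d * (256 - a), and the sum still fits in 16 bits
 * (the maximum is 255 * 255 + 255 = 65280). */
static uint8_t scale_inverted_plus_d(uint8_t d, uint8_t a)
{
    uint8_t  inv  = (uint8_t)~a;
    uint16_t wide = (uint16_t)(d * inv);
    wide = (uint16_t)(wide + d);
    return (uint8_t)(wide >> 8);
}
```

For d = 255 and a = 0 the reference gives 255 while the inverted-only variant gives 254; adding d back makes the two agree for every (d, a) pair.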
Reviewers: raster, cedric, stefan_schmidt
Reviewed By: cedric
Subscribers: cedric
Projects: #efl
Differential Revision: https://phab.enlightenment.org/D2308
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Make sure not to sample the mask image outside its boundaries.
This is a series of last resort checks. I can not reproduce the
crashes but know they have happened.
I used EINA_UNLIKELY more for clarity than for compiler optimizations.
As in the C version,
this increases the alpha value by 1 to avoid losing the remainder when it divides
values. The NEON version uses the same technique to produce the same results.
This patch reduces power consumption by around 18mA in certain scenarios
(music player list scroll, my files sound list scroll), making
evas_common_convert_argb_premul() ~60% faster (6.2msec->2.6msec).
Take music-player application, make 100 copies of the standard
“Over the Horizon” song, scroll up and down to see those
downscaled-from-720x720 thumbnails enter and leave the screen.
Every time a list item enters the screen, the image is re-read
(as evas image cache is not large enough to store more than two
pictures of that size), and one call of _common_convert_argb_premul()
occurs, taking ~6.2msec (which is not much compared to ~60msec
spent in libpng->libz (the biggest bottleneck here),
but still noticeable).
A similar power consumption improvement is observed during
scrolling sounds list of the same files in My Files application
(just with idle level ~100mA lower).
We also checked the new code to be correct on random input data.
All tests were performed on a Tizen device.
Signed-Off-By: Artem Dergachev <dergachev.a@samsung.com>
The previous logic increased the alpha channel by 1, so the color resulting from unpremul differed too much from the original.
We can't avoid losing the remainder when dividing values in premul/unpremul(),
but this recovers a value closer to the original value.
Previously, this remainder issue also affected the blending computation.
The blended color result was incorrect.
Signed-Off-By: Vladimir Kuramshin <v.kuramshin@samsung.com>
This is for now just a small experiment. It was based on the experiment made
with OpenMP. I preferred to only use Eina here as we already have all the infrastructure
to do this nicely and simply. As a result I get a 65% speed improvement on average for
the involved scaling operation. The secondary CPU on my laptop is running with a load of
75%. I don't have the time right now to do power consumption analysis, but I
think it shouldn't be too bad. I am also not throwing more cores at this as we are not able
to use the second core at its max already, so additional cores may result in a bigger
energy loss without enough gain.
A rare case of garbage data would happen if smooth scaling
was called with a mask and 1:1 scaling. Use the proper
render_op to COPY for the first pass.
@fix
Well... actually this is not exactly a fix.
It just restores the previous behaviour, and allows AA to
work. As in, it won't draw ugly black lines but properly
blend to transparent.
But there is still a problem:
The image map render function changes the alpha flag on the source
image if AA is enabled or if the map has an alpha color. This is
actually wrong as images forcefully set to not have any alpha
(with evas_object_image_alpha_set(0)) will then not be opaque
anymore.
Right now I can't think of a solution (also I don't quite follow
the entire pipeline in evas map...). Changing the flag will
make some opaque areas transparent. Not changing the flag will
produce ugly artifacts where AA blending should happen. Fix one
bug and the other appears, and vice versa.
This can be tested with the example evas-map-aa and adding an
alpha channel to cube1.png (with gimp for instance) but manually
setting alpha to 0 in the code. Weird stuff will happen (try
playing with the map and pressing I to switch to/from image mode).
The selected op func was not performing the correct operation,
thus producing rendering artifacts. These functions should not
be used anywhere except in case of masking... which was not an
available option earlier.
It was doing (wrong):
dst = interp(mask, src, dst)
Instead of (correct):
dst = dst + (1 - mask) * src
NOTE:
This commit also disables MMX, SSE3 & NEON implementations of
pixel_mask blend operations, since they are also broken.
Work done by Jaeun Choi, rebased & squashed by jpeg.
This commit introduces changes to the low-level draw functions
of the SW engine considering the existence of an alpha mask image.
Features:
- Font masking (TEXT, TEXTBLOCK),
- Rectangle masking,
- Image masking (all image scaling functions should be handled).
The mask image itself is not yet set in the draw context (see
following commits).
@feature
Signed-off-by: Jean-Philippe Andre <jp.andre@samsung.com>
So I've discovered some weird output values after drawing
some text. The destination alpha would become 0xFE even
when the back buffer had a background with 0xFF alpha.
Example:
Dest is 0xff00ff00 (green).
Color is 0xffffffff (white).
Current font alpha is 170 (0xaa).
--> Output was 0xFEaaFEaa instead of 0xFFaaFFaa.
This is because of some slightly invalid calculation
when doing the font masking (mtab[v] = 0x55 above).
Indeed, MUL_256 takes alpha values in the range [1-256]
and not [0-256] as was assumed.
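The numbers in the example can be reproduced with a few lines of C. This is a sketch under assumptions: `mul_256()` below is a local helper modelled on a packed per-channel multiply by a/256, not a copy of the Evas macro, and the premultiplied source value 0xaaaaaaaa (white at alpha 0xaa) is my own reading of the example.

```c
#include <stdint.h>

/* Scale each 8-bit channel of a packed ARGB pixel by a / 256.
 * 'a' is expected in [1, 256]; two channels are processed per multiply. */
static uint32_t mul_256(uint32_t a, uint32_t c)
{
    return ((((c >> 8) & 0x00ff00ff) * a) & 0xff00ff00) +
           ((((c & 0x00ff00ff) * a) >> 8) & 0x00ff00ff);
}

/* Blend a premultiplied source over dst using an inverse alpha in [1, 256]. */
static uint32_t blend(uint32_t src_premul, uint32_t dst, uint32_t inv_a256)
{
    return src_premul + mul_256(inv_a256, dst);
}
```

With dst = 0xff00ff00 and a font alpha of 0xaa, passing 255 - 0xaa = 0x55 as the inverse alpha gives 0xfeaafeaa, while passing 256 - 0xaa = 0x56 gives the expected 0xffaaffaa.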
This should ensure that the difference between the original
pixel value and the rle4 encoded one is <= 8.
The previous fix was a bit stupid as it was not taking into
account the conversion a4 to a8 (which is a8 = (a4 << 4) | a4).
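A sketch of why the a4 to a8 expansion matters for the error bound. `a4_to_a8()` is the conversion given above; `a8_to_a4()` is a hypothetical round-to-nearest quantiser written for illustration, not necessarily what the encoder does.

```c
#include <stdint.h>

/* Expand 4-bit alpha to 8 bits: (a4 << 4) | a4, i.e. a4 * 17. */
static uint8_t a4_to_a8(uint8_t a4)
{
    return (uint8_t)((a4 << 4) | a4);
}

/* Hypothetical quantiser: pick the a4 whose expansion is nearest to a8.
 * Expanded values are multiples of 17, so rounding to the nearest
 * multiple keeps the round-trip error at 8 or less. */
static uint8_t a8_to_a4(uint8_t a8)
{
    return (uint8_t)((a8 + 8) / 17);
}
```

A plain `a8 >> 4` would not meet the bound: 15 maps to 0, which expands back to 0, an error of 15.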
The clipper causes a different rendering result in the last 1 pixel of the width,
because the left edge x range (0 ~ (w - 1)) and the right edge x range (0 ~ w) are different.
This fix won't cause a memory over-access problem even if the x span position is at the end of the edge,
because the span width (x2 - x1) will be 0, which results in drawing being skipped.
The problem is hard to spot, but you can detect the subtle rendering difference when some arbitrary meshes with map is
You can compare image and rectangle map drawing for this.
@fix
Idea originated from Cedric the b0rker.
This is a big fat search-and-replace commit.
This commit also introduces space changes... Sorry for the mix.
NOTE: This commit may have one side effect as there was some very
dubious code changing the dst image's alpha flag in the
Gfx get functions. Logically this didn't make sense (at
draw time the dst alpha should already be well defined),
so it should be safe.
Also, mark some functions with a FIXME as they look just wrong.
COPY_REL is never used...
MMX and NEON optimizations should be implemented for COPY MASK+COL.
Summary:
Without this, compilation will fail with:
error: unknown type name 'pix_type'
error: expected identifier or '(' before 'else'
Applies to efl-1.11.0 and later
Bug: https://phab.enlightenment.org/T1620
Bug-Tizen: PTREL-737/part
Change-Id: Idbcb442803ed6559698b2a371d1d6c584ec053e0
Signed-off-by: Philippe Coval <philippe.coval@open.eurogiciel.org>
Test Plan:
gbs build -P "profile.tizen_common_armv7l" --arch armv7l --include-all
@fix
Reviewers: cedric
Subscribers: cedric
Differential Revision: https://phab.enlightenment.org/D1399
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
Now, the evas loader is supposed to advertise the actual border
size in case of compressed texture formats.
The only case where the border was non zero was ETC formats,
from the TGV loader, so I think we don't need to keep the
previous behaviour (auto-calculate borders for ETC).
ecore_evas_convert: Add -e/--encoding option
This uses directly the encoding parameter.
For now, used only by the TGV saver, but there is no other way
to specify between ETC1 and ETC2. And we don't have a mixed ETC1+2
mode (yet).
@feature
"f<color=#f00>i</color>f" could cause textblock to crash. It doesn't
crash anymore. It doesn't render the colours correctly either, but at
least this is the first step.
This is the start of fixing T1308
@bugfix
Summary: The comparison dc with NULL is not necessary. So remove the unnecessary conditional expression.
Reviewers: Hermet
Reviewed By: Hermet
CC: seoz, cedric
Differential Revision: https://phab.enlightenment.org/D908
The TGV file format is specifically created for Evas. It is designed to allow
region decompression and parallel decompression with a fast path for GPUs that
handle ETC1 compression. Plans for adding other compression methods will come
later.
configure: fix prerequisite header issue
Summary:
on some platforms like OpenBSD <sys/socket.h> must be included before
net/if.h
the canonical way to ensure that with autotools is by providing that
fourth directive.
evas: use MAP_ANON instead of MAP_ANONYMOUS
Stupid unpredictable standards (or not so standard).
MAP_ANON exists and is defined almost anywhere unlike MAP_ANONYMOUS
Let's use that for portability's sake (they are practically identical
anyway)
Reviewers: raster, cedric
Reviewed By: cedric
CC: cedric
Differential Revision: https://phab.enlightenment.org/D616
Signed-off-by: Cedric BAIL <cedric.bail@free.fr>
This happens with many texts. The issue occurs when the width of the
last char is larger than its advance. Before this patch, we didn't take the
width into account when calculating width, thus causing clipping issues
in some cases.
in all other convert functions, dst_jump is provided in pixels and
multiplied by the number of bytes-per-pixel either explicitly or
implicitly by using a different type for dst pointer (DATA16,
DATA32...).
As in 24 bits we use DATA8 we must explicitly multiply dst_jump by 3.
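A minimal sketch of the pointer arithmetic involved, not the Evas converter itself. `convert_argb_to_rgb888()` is a hypothetical function, the R, G, B byte order is an assumption, and `uint8_t`/`uint32_t` stand in for DATA8/DATA32.

```c
#include <stdint.h>

/* Convert ARGB8888 to packed 24-bit RGB.
 * src_jump and dst_jump are the numbers of PIXELS to skip at the end of
 * each row, as in the other convert functions. */
static void convert_argb_to_rgb888(const uint32_t *src, uint8_t *dst,
                                   int src_jump, int dst_jump, int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            uint32_t p = *src++;
            dst[0] = (uint8_t)(p >> 16);  /* R */
            dst[1] = (uint8_t)(p >> 8);   /* G */
            dst[2] = (uint8_t)p;          /* B */
            dst += 3;
        }
        /* src is a 32-bit pointer, so pixel counts scale implicitly. */
        src += src_jump;
        /* dst is a byte pointer, so the pixel count must be scaled by the
         * 3 bytes per pixel explicitly. */
        dst += dst_jump * 3;
    }
}
```

Without the `* 3`, every row after the first would start too early whenever dst_jump is non-zero.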
The structure should not be changed, despite the union modification.
I am renaming for consistency with older branches that had a mask
field in RGBA_Image. Also, the mask.data or data8 is really just
a way to avoid casting between DATA8 and DATA32 (and it shows
clearly what kind of data you are dealing with).
_op_blend_pan_mas_dp is just a duplication of the code in
_op_blend_pas_mas_dp. Remove the extra copy of the code and use a define
instead; this is what the SSE3 code already does.
Summary:
There's nothing SSE3-specific about that macro, let's use a more generic name
for it. Since that's just a generic macro, we can also allow non-SSE (eg.
NEON) code to use it if they want to
Reviewers: cedric
CC: cedric
Differential Revision: https://phab.enlightenment.org/D528
Well, raster did some great job at optimizing font draw... but only
to RGBA32 targets. In this font effects case, we also want to render
text on ALPHA buffers.
For now, reuse the existing alpha blending & glyph decompress
functions. It's MUCH easier, and works. Definitely slower than
decompressing on-the-fly and optimizing everything. But for now,
this will not even be the performance bottleneck in an effect
(blur will be a lot slower).
Evas is an RGBA only engine, BUT we also use some alpha masks,
especially in the font rendering pipeline.
This commit adds basic support for alpha buffer operations
(blend and copy).
RGBA_Image can then point to alpha-only data if
its colorspace is grey.
This is a long awaited feature that has been requested years ago.
Fontconfig finally added the support needed to make it happen, so here
it is.
I added a fontconfig query to look for similar fonts in case we loaded a
font from eet/edje/file(no fontconfig). This now works quite well.
Still missing: if you load a bold/italic/whatever font directly (set the file)
without putting ":weight=bold" you will not get run-time emboldenment if
only non-bold fonts are found.
This unfortunately depends on very recent fontconfig version (#ifed out
when unavailable), so only people with fontconfig >= 2.11 will enjoy
this feature.
This function does the following operation:
COPY pixel x mask --> dst
But it wasn't iterating over the source. So it was repeating
the value of the first pixel over and over again.
Is this even used anywhere? RGBA + alpha mask function!?
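To illustrate the class of bug, here is a sketch that treats the operation as a plain per-pixel multiply of source by mask, which is an assumption about its semantics. `mul_sym()` and `copy_pixel_mask()` are local helpers written for this example, not the Evas functions.

```c
#include <stdint.h>

/* Scale each 8-bit channel of a packed pixel by a / 255 (a in [0, 255]). */
static uint32_t mul_sym(uint32_t a, uint32_t c)
{
    return (((((c >> 8) & 0x00ff00ff) * a) + 0x00ff00ff) & 0xff00ff00) +
           (((((c & 0x00ff00ff) * a) + 0x00ff00ff) >> 8) & 0x00ff00ff);
}

/* dst[i] = src[i] * mask[i] for i in [0, len). */
static void copy_pixel_mask(const uint32_t *s, const uint8_t *m,
                            uint32_t *d, int len)
{
    uint32_t *end = d + len;

    while (d < end) {
        *d = mul_sym(*m, *s);
        s++;  /* advancing the source is the step that was missing */
        m++;
        d++;
    }
}
```

If `s++` is left out, every output pixel is the first source pixel scaled by its own mask value, which matches the repeated-first-pixel symptom described above.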
Summary:
When processing random data, the result of this function differs from the C variant in more than 50% of cases.
This difference is due to the alpha calculation. In C code:
a = 256 - (c >> 24)
in NEON:
"vmvn.u8 q7,q6 \n\t"
// ie (8 bit)~(c>>24) === 255 - (c>>24)
We can't just add "1" as overflow will occur in case (c>>24) == 0 (we use only 8 bits per channel in vector registers)
So here is the solution:
copy *d right before multiplication and add it to the result of it later.
This makes the function slower by 20-30% but it is still at least 2 times faster than the C code.
Reviewers: raster
Differential Revision: https://phab.enlightenment.org/D455
this changes the internal encoding of font glyphs in evas to use 4bit
uncompressed if small, or 4bit rle (run length encoded) if larger.
this saves at least 50% of memory on fonts - and more if bigger. with
large fonts (40-80pixel size) we can save in the region of 80% of
memory used for glyphs. this also happens to allow speedups in
rendering too.
We do have mmap provided by Evil, but there is no implementation yet of
an anonymous map support. Also it is not clear how the memory system of
windows does actually work, so not sure this optimization is relevant
to windows at all. Thus we disable it for the time being and unbreak
the windows support.
- cherry-pick me -
Summary: evas_scale_smooth would not compile with BUILD_NEON set
Reviewers: raster
CC: cedric
Differential Revision: https://phab.enlightenment.org/D424