This project is mirrored from https://github.com/llvm-doe-org/llvm-project.git.
Pull mirroring updated .
- 28 Jan, 2020 2 commits
-
-
Wang, Pengfei authored
Summary: Support pragma FP_CONTRACT under strict FP. Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3 Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72820
-
Guillaume Chatelet authored
Summary: This is a follow up on D61634. It adds an LLVM IR intrinsic to allow better implementation of memcpy from C++. A follow up CL will add the intrinsics in Clang. Reviewers: courbet, theraven, t.p.northover, jdoerfert, tejohnson Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71710
-
- 22 Jan, 2020 1 commit
-
-
Sander de Smalen authored
In LLVM IR, vscale can be represented with an intrinsic. For some targets, this is equivalent to the constexpr: getelementptr <vscale x 1 x i8>, <vscale x 1 x i8>* null, i32 1 This can be used to propagate the value in CodeGenPrepare. In ISel we add a node that can be legalized to one or more instructions to materialize the runtime vector length. This patch also adds SVE CodeGen support for VSCALE, which maps this node to RDVL instructions (for scaled multiples of 16bytes) or CNT[HSD] instructions (scaled multiples of 2, 4, or 8 bytes, respectively). Reviewers: rengolin, cameron.mcinally, hfinkel, sebpop, SjoerdMeijer, efriedma, lattner Reviewed by: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D68203
-
- 16 Jan, 2020 1 commit
-
-
Florian Hahn authored
llvm.memset intrinsics do only write memory, but are missing IntrWriteMem, so they doesNotReadMemory() returns false for them. The test change is due to the test checking the fn attribute ids at the call sites, which got bumped up due to a new combination with writeonly appearing in the test file. Reviewers: jdoerfert, reames, efriedma, nlopes, lebedev.ri Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D72789
-
- 13 Jan, 2020 1 commit
-
-
Sam Parker authored
Note that the intrinsic is now understood by SCEV and that other optimisations can treat it as a sub.
-
- 08 Jan, 2020 1 commit
-
-
Bevin Hansson authored
Summary: This patch adds intrinsics and ISelDAG nodes for signed and unsigned fixed-point division: llvm.sdiv.fix.* llvm.udiv.fix.* These intrinsics perform scaled division on two integers or vectors of integers. They are required for the implementation of the Embedded-C fixed-point arithmetic in Clang. Patch by: ebevhan Reviewers: bjope, leonardchan, efriedma, craig.topper Reviewed By: craig.topper Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70007
-
- 28 Dec, 2019 1 commit
-
-
Fangrui Song authored
-
- 18 Dec, 2019 1 commit
-
-
Ulrich Weigand authored
Add new intrinsics llvm.experimental.constrained.minimum llvm.experimental.constrained.maximum as strict versions of llvm.minimum and llvm.maximum. Includes SystemZ back-end support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71624
-
- 17 Dec, 2019 2 commits
-
-
Ulrich Weigand authored
The following intrinsics currently carry a rounding mode metadata argument: llvm.experimental.constrained.minnum llvm.experimental.constrained.maxnum llvm.experimental.constrained.ceil llvm.experimental.constrained.floor llvm.experimental.constrained.round llvm.experimental.constrained.trunc This is not useful since the semantics of those intrinsics do not in any way depend on the rounding mode. In similar cases, other constrained intrinsics do not have the rounding mode argument. Remove it here as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71218
-
Kevin P. Neal authored
of integers to floating point. This includes some of Craig Topper's changes for promotion support from D71130. Differential Revision: https://reviews.llvm.org/D69275
-
- 12 Dec, 2019 1 commit
-
-
Florian Hahn authored
This is the first patch adding an initial set of matrix intrinsics and a corresponding lowering pass. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html The first patch introduces four new intrinsics (transpose, multiply, columnwise load and store) and a LowerMatrixIntrinsics pass, that lowers those intrinsics to vector operations. Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix embedded in a <16 x float> vector) and the intrinsics take the dimension information as parameters. Those parameters need to be ConstantInt. For the memory layout, we initially assume column-major, but in the RFC we also described how to extend the intrinsics to support row-major as well. For the initial lowering, we split the input of the intrinsics into a set of column vectors, transform those column vectors and concatenate the result columns to a flat result vector. This allows us to lower the intrinsics without any shape propagation, as mentioned in the RFC. In follow-up patches, we plan to submit the following improvements: * Shape propagation to eliminate the embedding/splitting for each intrinsic. * Fused & tiled lowering of multiply and other operations. * Optimization remarks highlighting matrix expressions and costs. * Generate loops for operations on large matrixes. * More general block processing for operation on large vectors, exploiting shape information. We would like to add dedicated transpose, columnwise load and store intrinsics, even though they are not strictly necessary. For example, we could instead emit a large shufflevector instruction instead of the transpose. But we expect that to (1) become unwieldy for larger matrixes (even for 16x16 matrixes, the resulting shufflevector masks would be huge), (2) risk instcombine making small changes, causing us to fail to detect the transpose, preventing better lowerings For the load/store, we are additionally planning on exploiting the intrinsics for better alias analysis. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70456
-
- 07 Dec, 2019 1 commit
-
-
Ulrich Weigand authored
This adds support for constrained floating-point comparison intrinsics. Specifically, we add: declare <ty2> @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) declare <ty2> @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) The first variant implements an IEEE "quiet" comparison (i.e. we only get an invalid FP exception if either argument is a SNaN), while the second variant implements an IEEE "signaling" comparison (i.e. we get an invalid FP exception if either argument is any NaN). The condition code is implemented as a metadata string. The same set of predicates as for the fcmp instruction is supported (except for the "true" and "false" predicates). These new intrinsics are mapped by SelectionDAG codegen onto two new ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again representing quiet vs. signaling comparison operations. Otherwise those nodes look like SETCC nodes, with an additional chain argument and result as usual for strict FP nodes. The patch includes support for the common legalization operations for those nodes. The patch also includes full SystemZ back-end support for the new ISD nodes, mapping them to all available SystemZ instruction to fully implement strict semantics (scalar and vector). Differential Revision: https://reviews.llvm.org/D69281
-
- 08 Nov, 2019 1 commit
-
-
Philip Reames authored
The change itself is straight forward and obvious, but ... there's an existing test checking for exactly the opposite. Both I and Artur think this is simply conservatism in the initial implementation. If anyone bisects a problem to this, a counter example will be very interesting. Differential Revision: https://reviews.llvm.org/D69907
-
- 14 Oct, 2019 1 commit
-
-
Victor Campos authored
Fixing typo in comment line. llvm-svn: 374766
-
- 07 Oct, 2019 1 commit
-
-
Kevin P. Neal authored
Earlier in the year intrinsics for lrint, llrint, lround and llround were added to llvm. The constrained versions are now implemented here. Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally Approved by: craig.topper Differential Revision: https://reviews.llvm.org/D64746 llvm-svn: 373900
-
- 02 Oct, 2019 1 commit
-
-
Kerry McLaughlin authored
Summary: This allows intrinsics such as the following to be defined: - declare <n x 4 x i32> @llvm.something.nxv4f32(<n x 4 x i32>, <n x 4 x i1>, <n x 4 x float>) ...where <n x 4 x i32> is derived from <n x 4 x float>, but the element needs bitcasting to int. Reviewers: c-rhodes, sdesmalen, rovka Reviewed By: c-rhodes Subscribers: tschuett, hiraditya, jdoerfert, llvm-commits, cfe-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68021 llvm-svn: 373437
-
- 27 Sep, 2019 1 commit
-
-
Peter Collingbourne authored
We can't use short granules with stack instrumentation when targeting older API levels because the rest of the system won't understand the short granule tags stored in shadow memory. Moreover, we need to be able to let old binaries (which won't understand short granule tags) run on a new system that supports short granule tags. Such binaries will call the __hwasan_tag_mismatch function when their outlined checks fail. We can compensate for the binary's lack of support for short granules by implementing the short granule part of the check in the __hwasan_tag_mismatch function. Unfortunately we can't do anything about inline checks, but I don't believe that we can generate these by default on aarch64, nor did we do so when the ABI was fixed. A new function, __hwasan_tag_mismatch_v2, is introduced that lets code targeting the new runtime avoid redoing the short granule check. Because tag mismatches are rare this isn't important from a performance perspective; the main benefit is that it introduces a symbol dependency that prevents binaries targeting the new runtime from running on older (i.e. incompatible) runtimes. Differential Revision: https://reviews.llvm.org/D68059 llvm-svn: 373035
-
- 20 Sep, 2019 1 commit
-
-
Kerry McLaughlin authored
Summary: Both match the type of another intrinsic parameter of a vector type, but where each element is subdivided to form a vector with more elements of a smaller type. Subdivide2Argument allows intrinsics such as the following to be defined: - declare <vscale x 4 x i32> @llvm.something.nxv4i32(<vscale x 8 x i16>) Subdivide4Argument allows intrinsics such as: - declare <vscale x 4 x i32> @llvm.something.nxv4i32(<vscale x 16 x i8>) Tests are included in follow up patches which add intrinsics using these types. Reviewers: sdesmalen, SjoerdMeijer, greened, rovka Reviewed By: sdesmalen Subscribers: rovka, tschuett, jdoerfert, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67549 llvm-svn: 372380
-
- 07 Sep, 2019 1 commit
-
-
Bjorn Pettersson authored
Summary: Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Patch by: leonardchan, bjope Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel Reviewed By: leonardchan Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57836 llvm-svn: 371308
-
- 28 Aug, 2019 1 commit
-
-
Kevin P. Neal authored
This implements constrained floating point intrinsics for FP to signed and unsigned integers. Quoting from D32319: The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed. Reviewed by: Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton Approved by: Craig Topper Differential Revision: http://reviews.llvm.org/D63782 llvm-svn: 370228
-
- 15 Aug, 2019 1 commit
-
-
Florian Hahn authored
This patch adds a ptrmask intrinsic which allows masking out bits of a pointer that must be zero when accessing it, because of ABI alignment requirements or a restriction of the meaningful bits of a pointer through the data layout. This avoids doing a ptrtoint/inttoptr round trip in some cases (e.g. tagged pointers) and allows us to not lose information about the underlying object. Reviewers: nlopes, efriedma, hfinkel, sanjoy, jdoerfert, aqjune Reviewed by: sanjoy, jdoerfert Differential Revision: https://reviews.llvm.org/D59065 llvm-svn: 368986
-
- 14 Aug, 2019 4 commits
-
-
David Bolvansky authored
Reviewers: jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D66158 llvm-svn: 368810
-
John McCall authored
These rely on having an allocator provided to the coroutine and thus, for now, only work in retcon lowerings. llvm-svn: 368791
-
John McCall authored
Generalize llvm.coro.suspend.retcon to allow an arbitrary number of arguments to be passed back to the continuation function. llvm-svn: 368789
-
John McCall authored
A quick contrast of this ABI with the currently-implemented ABI: - Allocation is implicitly managed by the lowering passes, which is fine for frontends that are fine with assuming that allocation cannot fail. This assumption is necessary to implement dynamic allocas anyway. - The lowering attempts to fit the coroutine frame into an opaque, statically-sized buffer before falling back on allocation; the same buffer must be provided to every resume point. A buffer must be at least pointer-sized. - The resume and destroy functions have been combined; the continuation function takes a parameter indicating whether it has succeeded. - Conversely, every suspend point begins its own continuation function. - The continuation function pointer is directly returned to the caller instead of being stored in the frame. The continuation can therefore directly destroy the frame when exiting the coroutine instead of having to leave it in a defunct state. - Other values can be returned directly to the caller instead of going through a promise allocation. The frontend provides a "prototype" function declaration from which the type, calling convention, and attributes of the continuation functions are taken. - On the caller side, the frontend can generate natural IR that directly uses the continuation functions as long as it prevents IPO with the coroutine until lowering has happened. In combination with the point above, the frontend is almost totally in charge of the ABI of the coroutine. - Unique-yield coroutines are given some special treatment. llvm-svn: 368788
-
- 30 Jul, 2019 1 commit
-
-
Hideto Ueno authored
Summary: In D37215, AssumeLikeInstruction are regarded as `willreturn`. In this patch, annotation is added to those which don't have `willreturn` now(`sideeffect, object_size, experimental_widenable_condition`). Reviewers: jdoerfert, nikic, sstefan1 Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65455 llvm-svn: 367342
-
- 28 Jul, 2019 1 commit
-
-
Hideto Ueno authored
Summary: In D62801, new function attribute `willreturn` was introduced. In short, a function with `willreturn` is guaranteed to come back to the call site(more precise definition is in LangRef). In this patch, willreturn is annotated for LLVM intrinsics. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, nhaehnle, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64904 llvm-svn: 367184
-
- 25 Jul, 2019 1 commit
-
-
JF Bastien authored
Summary: This is useful for targets which have prefetch instructions for non-default address spaces. <rdar://problem/42662136> Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65254 llvm-svn: 367032
-
- 22 Jul, 2019 1 commit
-
-
Christudasan Devadasan authored
Modified the following 3 intrinsics: int_addressofreturnaddress, int_frameaddress & int_sponentry. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D64561 llvm-svn: 366679
-
- 17 Jul, 2019 1 commit
-
-
Hideto Ueno authored
Summary: Deduce the "willreturn" attribute for functions. For now, intrinsics are not willreturn. More annotation will be done in another patch. Reviewers: jdoerfert Subscribers: jvesely, nhaehnle, nicholas, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63046 llvm-svn: 366335
-
- 12 Jul, 2019 1 commit
-
-
Stanislav Mekhanoshin authored
These should really use v32f32, but were defined as v32i32 due to the lack of the v32f32 type. Differential Revision: https://reviews.llvm.org/D64667 llvm-svn: 365972
-
- 09 Jul, 2019 1 commit
-
-
Yonghong Song authored
For background of BPF CO-RE project, please refer to http://vger.kernel.org/bpfconf2019.html In summary, BPF CO-RE intends to compile bpf programs adjustable on struct/union layout change so the same program can run on multiple kernels with adjustment before loading based on native kernel structures. In order to do this, we need keep track of GEP(getelementptr) instruction base and result debuginfo types, so we can adjust on the host based on kernel BTF info. Capturing such information as an IR optimization is hard as various optimization may have tweaked GEP and also union is replaced by structure it is impossible to track fieldindex for union member accesses. Three intrinsic functions, preserve_{array,union,struct}_access_index, are introducted. addr = preserve_array_access_index(base, index, dimension) addr = preserve_union_access_index(base, di_index) addr = preserve_struct_access_index(base, gep_index, di_index) here, base: the base pointer for the array/union/struct access. index: the last access index for array, the same for IR/DebugInfo layout. dimension: the array dimension. gep_index: the access index based on IR layout. di_index: the access index based on user/debuginfo types. For example, for the following example, $ cat test.c struct sk_buff { int i; int b1:1; int b2:2; union { struct { int o1; int o2; } o; struct { char flags; char dev_id; } dev; int netid; } u[10]; }; static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) = (void *) 4; #define _(x) (__builtin_preserve_access_index(x)) int bpf_prog(struct sk_buff *ctx) { char dev_id; bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id)); return dev_id; } $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \ test.c >& log The generated IR looks like below: ... define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 { %2 = alloca %struct.sk_buff*, align 8 %3 = alloca i8, align 1 store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45 call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49 call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50 call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51 %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45 %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45 %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs( %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19 %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons( [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53 %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons( %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26 %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53 %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s( %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34 %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52 %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55 %13 = sext i8 %12 to i32, !dbg !54 call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56 ret i32 %13, !dbg !57 } !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20) !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27) !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35) Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index attached to instructions to provide struct/union debuginfo type information. For &ctx->u[5].dev.dev_id, . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout. . The "%7 = ..." represents array subscript "5". . The "%8 = ..." represents union member "dev" with index 1 for DI layout. . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout. Basically, traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and examining all preserve_*_access_index calls, the debuginfo struct/union/array access index can be achieved. The intrinsics also contain enough information to regenerate codes for IR layout. For array and structure intrinsics, the proper GEP can be constructed. For union intrinsics, replacing all uses of "addr" with "base" should be enough. The test case ThinLTO/X86/lazyload_metadata.ll is adjusted to reflect the new addition of the metadata. Signed-off-by:
Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61810 llvm-svn: 365423
-
- 08 Jul, 2019 2 commits
-
-
Yonghong Song authored
This reverts commit r365352. Test ThinLTO/X86/lazyload_metadata.ll failed. Revert the commit and at the same time to fix the issue. llvm-svn: 365360
-
Yonghong Song authored
For background of BPF CO-RE project, please refer to http://vger.kernel.org/bpfconf2019.html In summary, BPF CO-RE intends to compile bpf programs adjustable on struct/union layout change so the same program can run on multiple kernels with adjustment before loading based on native kernel structures. In order to do this, we need keep track of GEP(getelementptr) instruction base and result debuginfo types, so we can adjust on the host based on kernel BTF info. Capturing such information as an IR optimization is hard as various optimization may have tweaked GEP and also union is replaced by structure it is impossible to track fieldindex for union member accesses. Three intrinsic functions, preserve_{array,union,struct}_access_index, are introducted. addr = preserve_array_access_index(base, index, dimension) addr = preserve_union_access_index(base, di_index) addr = preserve_struct_access_index(base, gep_index, di_index) here, base: the base pointer for the array/union/struct access. index: the last access index for array, the same for IR/DebugInfo layout. dimension: the array dimension. gep_index: the access index based on IR layout. di_index: the access index based on user/debuginfo types. For example, for the following example, $ cat test.c struct sk_buff { int i; int b1:1; int b2:2; union { struct { int o1; int o2; } o; struct { char flags; char dev_id; } dev; int netid; } u[10]; }; static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) = (void *) 4; #define _(x) (__builtin_preserve_access_index(x)) int bpf_prog(struct sk_buff *ctx) { char dev_id; bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id)); return dev_id; } $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \ test.c >& log The generated IR looks like below: ... define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 { %2 = alloca %struct.sk_buff*, align 8 %3 = alloca i8, align 1 store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45 call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49 call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50 call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51 %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45 %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45 %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs( %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19 %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons( [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53 %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons( %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26 %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53 %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s( %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34 %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52 %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55 %13 = sext i8 %12 to i32, !dbg !54 call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56 ret i32 %13, !dbg !57 } !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20) !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27) !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35) Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index attached to instructions to provide struct/union debuginfo type information. For &ctx->u[5].dev.dev_id, . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout. . The "%7 = ..." represents array subscript "5". . The "%8 = ..." represents union member "dev" with index 1 for DI layout. . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout. Basically, traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and examining all preserve_*_access_index calls, the debuginfo struct/union/array access index can be achieved. The intrinsics also contain enough information to regenerate codes for IR layout. For array and structure intrinsics, the proper GEP can be constructed. For union intrinsics, replacing all uses of "addr" with "base" should be enough. Signed-off-by:
Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61810 llvm-svn: 365352
-
- 28 Jun, 2019 1 commit
-
-
Sam Parker authored
Introduce llvm.test.set.loop.iterations which sets the loop counter and also produces an i1 after testing that the count is not zero. Differential Revision: https://reviews.llvm.org/D63809 llvm-svn: 364628
-
- 13 Jun, 2019 1 commit
-
-
Sander de Smalen authored
This patch uses the mechanism from D62995 to strengthen the definitions of the reduction intrinsics by letting the scalar result/accumulator type be overloaded from the vector element type. For example: ; The LLVM LangRef specifies that the scalar result must equal the ; vector element type, but this is not checked/enforced by LLVM. declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a) This patch changes that into: declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a) Which has the type-constraint more explicit and causes LLVM to check the result type with the vector element type. Reviewers: RKSimon, arsenm, rnk, greened, aemerson Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D62996 llvm-svn: 363240
-
- 11 Jun, 2019 1 commit
-
-
Sander de Smalen authored
This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags on callsites. This change introduces the following new intrinsics to replace the existing ones: llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul and adds functionality to auto-upgrade existing LLVM IR and bitcode. Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D60261 llvm-svn: 363035
-
- 07 Jun, 2019 1 commit
-
-
Sam Parker authored
Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops and PowerPC has been ported over to use this generic pass. The target dependent parts have been moved into TargetTransformInfo, via isHardwareLoopProfitable, with HardwareLoopInfo introduced to transfer information from the backend. Three generic intrinsics have been introduced: - void @llvm.set_loop_iterations Takes as a single operand, the number of iterations to be executed. - i1 @llvm.loop_decrement(anyint) Takes the maximum number of elements processed in an iteration of the loop body and subtracts this from the total count. Returns false when the loop should exit. - anyint @llvm.loop_decrement_reg(anyint, anyint) Takes the number of elements remaining to be processed as well as the maximum numbe of elements processed in an iteration of the loop body. Returns the updated number of elements remaining. llvm-svn: 362774
-
- 28 May, 2019 1 commit
-
-
Adhemerval Zanella authored
This patch add the ISD::LRINT and ISD::LLRINT along with new intrinsics. The changes are straightforward as for other floating-point rounding functions, with just some adjustments required to handle the return value being an interger. The idea is to optimize lrint/llrint generation for AArch64 in a subsequent patch. Current semantic is just route it to libm symbol. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D62017 llvm-svn: 361875
-
- 21 May, 2019 1 commit
-
-
Leonard Chan authored
Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D55720 llvm-svn: 361289
-