Commit 406d24b5 authored by Tanya Lattner

Merge 97980 from mainline.

Add documentation on sibling call optimization. Rename tailcall2.ll test to sibcall.ll.

llvm-svn: 98313
parent 6fdd7771
+45 −0
@@ -86,6 +86,7 @@
  <li><a href="#targetimpls">Target-specific Implementation Notes</a>
    <ul>
    <li><a href="#tailcallopt">Tail call optimization</a></li>
    <li><a href="#sibcallopt">Sibling call optimization</a></li>
    <li><a href="#x86">The X86 backend</a></li>
    <li><a href="#ppc">The PowerPC backend</a>
      <ul>
@@ -1732,6 +1733,50 @@ define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
   (because one or more of above constraints are not met) to be followed by a
   readjustment of the stack. So performance might be worse in such cases.</p>

</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="sibcallopt">Sibling call optimization</a>
</div>

<div class="doc_text">

<p>Sibling call optimization is a restricted form of tail call optimization.
   Unlike the tail call optimization described in the previous section, it can
   be performed automatically on any tail call when the <tt>-tailcallopt</tt>
   option is not specified.</p>

<p>Sibling call optimization is currently performed on x86/x86-64 when the
   following constraints are met:</p>

<ul>
  <li>Caller and callee have the same calling convention. It can be either
      <tt>c</tt> or <tt>fastcc</tt>.</li>

  <li>The call is in tail position: <tt>ret</tt> immediately follows the call,
      and <tt>ret</tt> either uses the value of the call or is void.</li>

  <li>Caller and callee have matching return types, or the callee result is
      not used.</li>

  <li>If any of the callee arguments are passed on the stack, they must be
      available in the caller's own incoming argument stack and the frame
      offsets must be the same.</li>
</ul>

<p>Example:</p>
<div class="doc_code">
<pre>
declare i32 @bar(i32, i32)

define i32 @foo(i32 %a, i32 %b, i32 %c) {
entry:
  %0 = tail call i32 @bar(i32 %a, i32 %b)
  ret i32 %0
}
</pre>
</div>
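
<p>The stack constraint can be illustrated with a variation on the example
   above. This is only a sketch: the function name <tt>@baz</tt> is
   hypothetical, and it assumes 32-bit x86, where the <tt>c</tt> calling
   convention passes arguments on the stack. Swapping the argument order puts
   each value at a different frame offset from the one it occupies in the
   caller's incoming arguments, so the call would not be expected to qualify as
   a sibling call there. On x86-64 these arguments would normally travel in
   registers, in which case the stack constraint would not come into play.</p>
<div class="doc_code">
<pre>
declare i32 @bar(i32, i32)

; Hypothetical caller, for illustration only.
define i32 @baz(i32 %a, i32 %b, i32 %c) {
entry:
  ; %b and %a are passed in swapped positions, so (assuming stack-passed
  ; arguments) their outgoing offsets differ from their incoming offsets.
  %0 = tail call i32 @bar(i32 %b, i32 %a)
  ret i32 %0
}
</pre>
</div>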

</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
+5 −2
@@ -5169,8 +5169,11 @@ Loop: ; Infinite loop that counts from 0 on up...
      a <a href="#i_ret"><tt>ret</tt></a> instruction.  If the "tail" marker is
      present, the function call is eligible for tail call optimization,
      but <a href="CodeGenerator.html#tailcallopt">might not in fact be
-      optimized into a jump</a>.  As of this writing, the extra requirements for
-      a call to actually be optimized are:
+      optimized into a jump</a>.  The code generator may optimize calls marked
+      "tail" with either 1) automatic <a href="CodeGenerator.html#sibcallopt">
+      sibling call optimization</a> when the caller and callee have
+      matching signatures, or 2) forced tail call optimization when the
+      following extra requirements are met:
      <ul>
        <li>Caller and callee both have the calling
            convention <tt>fastcc</tt>.</li>