Skip to content

Conversation

felipepiovezan
Copy link
Contributor

In architectures where pointers may contain metadata, such as arm64e, it is important to ignore those bits when comparing two different StackIDs, as the metadata may not be the same even if the pointers are.

This patch is a step towards allowing consumers of pointers to decide whether they want to keep or remove metadata, as opposed to discarding metadata at the moment pointers are created. See
#150537.

This was tested running the LLDB test suite on arm64e.

In architectures where pointers may contain metadata, such as arm64e, it
is important to ignore those bits when comparing two different StackIDs,
as the metadata may not be the same even if the pointers are.

This patch is a step towards allowing consumers of pointers to decide
whether they want to keep or remove metadata, as opposed to discarding
metadata at the moment pointers are created. See
llvm#150537.

This was tested running the LLDB test suite on arm64e.
@llvmbot
Copy link
Member

llvmbot commented Aug 8, 2025

@llvm/pr-subscribers-lldb

Author: Felipe de Azevedo Piovezan (felipepiovezan)

Changes

In architectures where pointers may contain metadata, such as arm64e, it is important to ignore those bits when comparing two different StackIDs, as the metadata may not be the same even if the pointers are.

This patch is a step towards allowing consumers of pointers to decide whether they want to keep or remove metadata, as opposed to discarding metadata at the moment pointers are created. See
#150537.

This was tested running the LLDB test suite on arm64e.


Full diff: https://github.com/llvm/llvm-project/pull/152796.diff

3 Files Affected:

  • (modified) lldb/include/lldb/Target/StackID.h (+5-5)
  • (modified) lldb/source/Target/StackFrame.cpp (+10-6)
  • (modified) lldb/source/Target/StackID.cpp (+18)
diff --git a/lldb/include/lldb/Target/StackID.h b/lldb/include/lldb/Target/StackID.h
index 09392792cb501..fddbc8e48dfdc 100644
--- a/lldb/include/lldb/Target/StackID.h
+++ b/lldb/include/lldb/Target/StackID.h
@@ -14,14 +14,15 @@
 
 namespace lldb_private {
 
+class Process;
+
 class StackID {
 public:
   // Constructors and Destructors
   StackID() = default;
 
   explicit StackID(lldb::addr_t pc, lldb::addr_t cfa,
-                   SymbolContextScope *symbol_scope)
-      : m_pc(pc), m_cfa(cfa), m_symbol_scope(symbol_scope) {}
+                   SymbolContextScope *symbol_scope, Process *process);
 
   StackID(const StackID &rhs)
       : m_pc(rhs.m_pc), m_cfa(rhs.m_cfa), m_symbol_scope(rhs.m_symbol_scope) {}
@@ -63,9 +64,8 @@ class StackID {
 protected:
   friend class StackFrame;
 
-  void SetPC(lldb::addr_t pc) { m_pc = pc; }
-
-  void SetCFA(lldb::addr_t cfa) { m_cfa = cfa; }
+  void SetPC(lldb::addr_t pc, Process *process);
+  void SetCFA(lldb::addr_t cfa, Process *process);
 
   lldb::addr_t m_pc =
       LLDB_INVALID_ADDRESS; // The pc value for the function/symbol for this
diff --git a/lldb/source/Target/StackFrame.cpp b/lldb/source/Target/StackFrame.cpp
index d97a814952186..40049d40aa413 100644
--- a/lldb/source/Target/StackFrame.cpp
+++ b/lldb/source/Target/StackFrame.cpp
@@ -61,8 +61,9 @@ StackFrame::StackFrame(const ThreadSP &thread_sp, user_id_t frame_idx,
                        const SymbolContext *sc_ptr)
     : m_thread_wp(thread_sp), m_frame_index(frame_idx),
       m_concrete_frame_index(unwind_frame_index), m_reg_context_sp(),
-      m_id(pc, cfa, nullptr), m_frame_code_addr(pc), m_sc(), m_flags(),
-      m_frame_base(), m_frame_base_error(), m_cfa_is_valid(cfa_is_valid),
+      m_id(pc, cfa, nullptr, thread_sp->GetProcess().get()),
+      m_frame_code_addr(pc), m_sc(), m_flags(), m_frame_base(),
+      m_frame_base_error(), m_cfa_is_valid(cfa_is_valid),
       m_stack_frame_kind(kind),
       m_behaves_like_zeroth_frame(behaves_like_zeroth_frame),
       m_variable_list_sp(), m_variable_list_value_objects(),
@@ -71,7 +72,7 @@ StackFrame::StackFrame(const ThreadSP &thread_sp, user_id_t frame_idx,
   // recursive functions properly aren't confused with one another on a history
   // stack.
   if (IsHistorical() && !m_cfa_is_valid) {
-    m_id.SetCFA(m_frame_index);
+    m_id.SetCFA(m_frame_index, thread_sp->GetProcess().get());
   }
 
   if (sc_ptr != nullptr) {
@@ -87,7 +88,8 @@ StackFrame::StackFrame(const ThreadSP &thread_sp, user_id_t frame_idx,
                        const SymbolContext *sc_ptr)
     : m_thread_wp(thread_sp), m_frame_index(frame_idx),
       m_concrete_frame_index(unwind_frame_index),
-      m_reg_context_sp(reg_context_sp), m_id(pc, cfa, nullptr),
+      m_reg_context_sp(reg_context_sp),
+      m_id(pc, cfa, nullptr, thread_sp->GetProcess().get()),
       m_frame_code_addr(pc), m_sc(), m_flags(), m_frame_base(),
       m_frame_base_error(), m_cfa_is_valid(true),
       m_stack_frame_kind(StackFrame::Kind::Regular),
@@ -115,7 +117,7 @@ StackFrame::StackFrame(const ThreadSP &thread_sp, user_id_t frame_idx,
       m_concrete_frame_index(unwind_frame_index),
       m_reg_context_sp(reg_context_sp),
       m_id(pc_addr.GetLoadAddress(thread_sp->CalculateTarget().get()), cfa,
-           nullptr),
+           nullptr, thread_sp->GetProcess().get()),
       m_frame_code_addr(pc_addr), m_sc(), m_flags(), m_frame_base(),
       m_frame_base_error(), m_cfa_is_valid(true),
       m_stack_frame_kind(StackFrame::Kind::Regular),
@@ -1995,7 +1997,9 @@ void StackFrame::UpdatePreviousFrameFromCurrentFrame(StackFrame &curr_frame) {
   std::lock_guard<std::recursive_mutex> guard(m_mutex);
   assert(GetStackID() ==
          curr_frame.GetStackID());     // TODO: remove this after some testing
-  m_id.SetPC(curr_frame.m_id.GetPC()); // Update the Stack ID PC value
+  m_id.SetPC(
+      curr_frame.m_id.GetPC(),
+      curr_frame.CalculateProcess().get()); // Update the Stack ID PC value
   assert(GetThread() == curr_frame.GetThread());
   m_frame_index = curr_frame.m_frame_index;
   m_concrete_frame_index = curr_frame.m_concrete_frame_index;
diff --git a/lldb/source/Target/StackID.cpp b/lldb/source/Target/StackID.cpp
index 410d5b7e820a5..b1795970802ae 100644
--- a/lldb/source/Target/StackID.cpp
+++ b/lldb/source/Target/StackID.cpp
@@ -10,10 +10,28 @@
 #include "lldb/Symbol/Block.h"
 #include "lldb/Symbol/Symbol.h"
 #include "lldb/Symbol/SymbolContext.h"
+#include "lldb/Target/Process.h"
 #include "lldb/Utility/Stream.h"
 
 using namespace lldb_private;
 
+StackID::StackID(lldb::addr_t pc, lldb::addr_t cfa,
+                 SymbolContextScope *symbol_scope, Process *process)
+    : m_pc(pc), m_cfa(cfa), m_symbol_scope(symbol_scope) {
+  if (process) {
+    m_pc = process->FixCodeAddress(m_pc);
+    m_cfa = process->FixDataAddress(m_cfa);
+  }
+}
+
+void StackID::SetPC(lldb::addr_t pc, Process *process) {
+  m_pc = process ? process->FixCodeAddress(pc) : pc;
+}
+
+void StackID::SetCFA(lldb::addr_t cfa, Process *process) {
+  m_cfa = process ? process->FixDataAddress(cfa) : cfa;
+}
+
 void StackID::Dump(Stream *s) {
   s->Printf("StackID (pc = 0x%16.16" PRIx64 ", cfa = 0x%16.16" PRIx64
             ", symbol_scope = %p",

Copy link
Collaborator

@jasonmolenda jasonmolenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. Yes, I think lldb currently has a mix of "clean metadata bits when we first see an address" and "clean metadata bits before using" (e.g. Process::ReadMemory), and we should have a discussion about whether we should keep a mix of input-time and use-time clearing of bits, or if we should switch to only clearing them at use-time and maintaining metadata until that point. But this brings StackID in line with other use-time clearings, it's a straightforward change.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@felipepiovezan felipepiovezan merged commit 9c8e716 into llvm:main Aug 11, 2025
11 checks passed
@felipepiovezan felipepiovezan deleted the felipe/cleanup_pointers_stackid branch August 11, 2025 15:27
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request Sep 2, 2025
In architectures where pointers may contain metadata, such as arm64e, it
is important to ignore those bits when comparing two different StackIDs,
as the metadata may not be the same even if the pointers are.

This patch is a step towards allowing consumers of pointers to decide
whether they want to keep or remove metadata, as opposed to discarding
metadata at the moment pointers are created. See
llvm#150537.

This was tested running the LLDB test suite on arm64e.

(cherry picked from commit 9c8e716)
medismailben pushed a commit to medismailben/llvm-project that referenced this pull request Sep 5, 2025
In architectures where pointers may contain metadata, such as arm64e, it
is important to ignore those bits when comparing two different StackIDs,
as the metadata may not be the same even if the pointers are.

This patch is a step towards allowing consumers of pointers to decide
whether they want to keep or remove metadata, as opposed to discarding
metadata at the moment pointers are created. See
llvm#150537.

This was tested running the LLDB test suite on arm64e.

(cherry picked from commit 9c8e716)
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request Sep 10, 2025
In this commit:

9c8e716 [lldb] Make StackID call Fix{Code,Data} pointers (llvm#152796)

We made StackID keep track of the CFA without any pointer metadata in
it. This is necessary when comparing two StackIDs to determine which one
is "younger".

However, the CFA inside StackIDs is also used in other contexts through
the method StackID::GetCallFrameAddress. One notable case is
DWARFExpression: the computation of `DW_OP_call_frame_address` is done
using StackID. This feeds into many other places, e.g. expression
evaluation may require the address of a variable that is computed from
the CFA; to access the variable without faulting, we may need to
preserve the pointer metadata. As such, StackID must be able to provide
both versions of the CFA.

In the spirit of allowing consumers of pointers to decide what to do
with pointer metadata, this patch changes StackID to store both versions
of the cfa pointer. Two getter methods are provided, and all call sites
except DWARFExpression preserve their existing behavior (stripped
pointer). Other alternatives were considered:

* Just store the raw pointer. This would require changing the
  comparisong operator `<` to also receive a Process, as the comparison
  requires stripped pointers. It wasn't clear if all call-sites had a
  non-null process, whereas we know we have a process when creating a
  StackID.

* Store a weak pointer to the process inside the class, and then strip
  metadata as needed. This would require a `weak_ptr::lock` in many
  operations of LLDB, and it felt wasteful. It also prevents stripping
  of the pointer if the process has gone away.

This patch also changes RegisterContextUnwind::ReadFrameAddress, which
is the method computing the CFA fed into StackID, to also preserve the
signature pointers.
felipepiovezan added a commit that referenced this pull request Sep 12, 2025
[lldb] Track CFA pointer metadata in StackID

    In this commit:

9c8e716 [lldb] Make StackID call Fix{Code,Data} pointers (#152796)

We made StackID keep track of the CFA without any pointer metadata in
it. This is necessary when comparing two StackIDs to determine which one
    is "younger".

However, the CFA inside StackIDs is also used in other contexts through
    the method StackID::GetCallFrameAddress. One notable case is
DWARFExpression: the computation of `DW_OP_call_frame_address` is done
    using StackID. This feeds into many other places, e.g. expression
evaluation may require the address of a variable that is computed from
    the CFA; to access the variable without faulting, we may need to
preserve the pointer metadata. As such, StackID must be able to provide
    both versions of the CFA.

    In the spirit of allowing consumers of pointers to decide what to do
with pointer metadata, this patch changes StackID to store both versions
of the cfa pointer. Two getter methods are provided, and all call sites
    except DWARFExpression preserve their existing behavior (stripped
    pointer). Other alternatives were considered:

    * Just store the raw pointer. This would require changing the
comparisong operator `<` to also receive a Process, as the comparison
requires stripped pointers. It wasn't clear if all call-sites had a
non-null process, whereas we know we have a process when creating a
      StackID.

* Store a weak pointer to the process inside the class, and then strip
      metadata as needed. This would require a `weak_ptr::lock` in many
operations of LLDB, and it felt wasteful. It also prevents stripping
      of the pointer if the process has gone away.

This patch also changes RegisterContextUnwind::ReadFrameAddress, which
is the method computing the CFA fed into StackID, to also preserve the
    signature pointers.
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request Sep 12, 2025
In this commit:

9c8e716 [lldb] Make StackID call Fix{Code,Data} pointers (llvm#152796)

We made StackID keep track of the CFA without any pointer metadata in
it. This is necessary when comparing two StackIDs to determine which one
is "younger".

However, the CFA inside StackIDs is also used in other contexts through
the method StackID::GetCallFrameAddress. One notable case is
DWARFExpression: the computation of `DW_OP_call_frame_address` is done
using StackID. This feeds into many other places, e.g. expression
evaluation may require the address of a variable that is computed from
the CFA; to access the variable without faulting, we may need to
preserve the pointer metadata. As such, StackID must be able to provide
both versions of the CFA.

In the spirit of allowing consumers of pointers to decide what to do
with pointer metadata, this patch changes StackID to store both versions
of the cfa pointer. Two getter methods are provided, and all call sites
except DWARFExpression preserve their existing behavior (stripped
pointer). Other alternatives were considered:

* Just store the raw pointer. This would require changing the
comparisong operator `<` to also receive a Process, as the comparison
requires stripped pointers. It wasn't clear if all call-sites had a
non-null process, whereas we know we have a process when creating a
StackID.

* Store a weak pointer to the process inside the class, and then strip
metadata as needed. This would require a `weak_ptr::lock` in many
operations of LLDB, and it felt wasteful. It also prevents stripping
of the pointer if the process has gone away.

This patch also changes RegisterContextUnwind::ReadFrameAddress, which
is the method computing the CFA fed into StackID, to also preserve the
signature pointers.

(cherry picked from commit 5d088ba)
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request Sep 24, 2025
In this commit:

9c8e716 [lldb] Make StackID call Fix{Code,Data} pointers (llvm#152796)

We made StackID keep track of the CFA without any pointer metadata in
it. This is necessary when comparing two StackIDs to determine which one
is "younger".

However, the CFA inside StackIDs is also used in other contexts through
the method StackID::GetCallFrameAddress. One notable case is
DWARFExpression: the computation of `DW_OP_call_frame_address` is done
using StackID. This feeds into many other places, e.g. expression
evaluation may require the address of a variable that is computed from
the CFA; to access the variable without faulting, we may need to
preserve the pointer metadata. As such, StackID must be able to provide
both versions of the CFA.

In the spirit of allowing consumers of pointers to decide what to do
with pointer metadata, this patch changes StackID to store both versions
of the cfa pointer. Two getter methods are provided, and all call sites
except DWARFExpression preserve their existing behavior (stripped
pointer). Other alternatives were considered:

* Just store the raw pointer. This would require changing the
comparisong operator `<` to also receive a Process, as the comparison
requires stripped pointers. It wasn't clear if all call-sites had a
non-null process, whereas we know we have a process when creating a
StackID.

* Store a weak pointer to the process inside the class, and then strip
metadata as needed. This would require a `weak_ptr::lock` in many
operations of LLDB, and it felt wasteful. It also prevents stripping
of the pointer if the process has gone away.

This patch also changes RegisterContextUnwind::ReadFrameAddress, which
is the method computing the CFA fed into StackID, to also preserve the
signature pointers.

(cherry picked from commit 5d088ba)
(cherry picked from commit c507ec1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants