How to Track the DWARF Generation Process

At the beginning of this chapter, we listed the tools that go build actually invokes during execution. Of these, compile and link are the ones that generate DWARF debugging information. This article covers how to track the general process of DWARF generation in the compiler (compile).

For a relatively unfamiliar project, people usually try to understand its execution flow by reading the code or by stepping through it with a debugger.

Code Reading Approach

This is probably the first approach that comes to mind. Reading code helps us understand both the main flow and the corner-case details. However, when the codebase is large we need to be careful to exclude irrelevant code; otherwise it's easy to get lost. If readers are unfamiliar with the project, it becomes even more challenging.

Take Go as an example. Considering just the compiler and linker in the compilation toolchain, the Go source has over 440,000 lines of code. Although the author is quite familiar with this code, reading it without any tools can still feel like "getting lost in the forest".

path-to/go/src/cmd $ cloc compile/ link/

     877 text files.
     853 unique files.                                        
      34 files ignored.

github.com/AlDanial/cloc v 2.01  T=1.00 s (853.0 files/s, 561112.0 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Go                              810          28854          72117         442276
Snakemake                        20           1968              0          13760
Markdown                          4            379             23           1313
Text                              6             60              0            146
Assembly                          9             21             35             92
Objective-C                       1              2              3             11
Bourne Shell                      1              5              6             10
Bourne Again Shell                1              7             10              9
MATLAB                            1              1              0              4
--------------------------------------------------------------------------------
SUM:                            853          31297          72194         457621
--------------------------------------------------------------------------------

Let me share some VSCode plugins I commonly use for code reading, which are very helpful when dealing with medium to large projects and long processing logic:

  • bookmarks: Create a branch such as notes/go1.24 and add bookmarks while reading code, naming and describing them in a consistent format such as "Category": "Bookmark Description". This makes later review much easier.
  • codetour: On the same kind of branch (notes/go1.24), record the details of a particular flow you care about. First create a tour, then add a description for each step along the way. Later you can replay the key steps of the process one by one.

The bookmarks and tours we add are stored in the .vscode/ directory of the project branch. Remember to commit them to the repository, so when reading code on another device, you can seamlessly continue. The author has been using this method for years and finds it very helpful.

Debugger Tracking Approach

Stepping through with a debugger lets us skip the many branches that are never reached on the path we care about. There is one catch: the released Go compilation toolchain has its DWARF stripped, so if you try to debug the toolchain itself, you generally can't, because the debugging information is missing.

One solution is to rebuild the compilation toolchain from source:

# Download the Go source code and check out the go1.24.0 release tag
# (tags in the Go repository are named goX.Y.Z, not vX.Y)
git clone https://github.com/golang/go
cd go
git checkout go1.24.0

# Rewrite the VERSION file so that the version line reads tests/go1.24.0
# (note '>' rather than '>>': the file is overwritten, not appended to).
# With this change, the toolchain build no longer uses the compiler and
# linker options that strip DWARF.
cat > VERSION <<EOF
tests/go1.24.0
time 2025-02-10T23:33:55Z
EOF

# Build, output will be in path-to/go/bin and path-to/go/pkg/tool/ directories
cd src
./make.bash

After building, you can check the build artifacts:

ls ../bin/ ../pkg/tool/linux_amd64/
../bin/:
go  gofmt

../pkg/tool/linux_amd64/:
addr2line  buildid  compile  cover  distpack  fix   nm       pack   preprofile  trace
asm        cgo      covdata  dist   doc       link  objdump  pprof  test2json   vet

Now you can use readelf -S ../pkg/tool/linux_amd64/compile | grep debug to see that the program contains DWARF debugging information and can be tracked with a debugger.
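
As a cross-check on the readelf command above, the same question can be asked from Go. The following is a minimal sketch, assuming an ELF binary (as on Linux). It uses the standard library's debug/elf package; the helper names isDebugSection and debugSections are illustrative and not part of any Go tool.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strings"
)

// isDebugSection reports whether an ELF section name looks like a DWARF
// section. ".zdebug_" is the legacy prefix for compressed DWARF sections.
func isDebugSection(name string) bool {
	return strings.HasPrefix(name, ".debug_") || strings.HasPrefix(name, ".zdebug_")
}

// debugSections returns the names of the DWARF sections found in the ELF
// file at path (hypothetical helper for this sketch).
func debugSections(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var names []string
	for _, s := range f.Sections {
		if isDebugSection(s.Name) {
			names = append(names, s.Name)
		}
	}
	return names, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: hasdwarf <elf-binary>")
		return
	}
	names, err := debugSections(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	if len(names) == 0 {
		fmt.Println("no DWARF sections found")
		return
	}
	fmt.Println(strings.Join(names, "\n"))
}
```

Pointed at the rebuilt ../pkg/tool/linux_amd64/compile, this should list names such as .debug_info and .debug_line; pointed at a stripped release binary, it should report that no DWARF sections were found.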

eBPF Tracking Approach

Those who know me know that I like to keep breaking down boundaries. I don't like workplace practices that create information barriers. I like being open-minded and transparent, including about risks in the service architecture. I don't like managing things through personal notes; I prefer open discussion through issues, because I believe that the more information a person has, the more reasonable their decisions become. This is great for both personal and team growth. Having been active in the open source community for years, I understand very well how much openness can unleash the potential of excellent individuals. But some people like to work in secret, hold small closed meetings, ask questions without providing context, are "reluctant" to share materials, and don't want others to know about issues in their modules. I don't like that.

When my leader asked me to work towards becoming a TechLead, I started implementing my series of ideas.

  1. You want to hide system issues? OK, let's pull main caller and callee dimension monitoring data from the monitoring platform, establish an SLA dashboard, exposing the success rate of every interface of every service under each person's name on the dashboard;
  2. You want to hide solution issues? OK, let's create a wiki space, moving all system designs, milestone plans, and progress tracking for each subsystem of all teams up there;
  3. You like to hide problems, avoid discussing issues, and "secretly" modify code? OK, let's implement fine-grained management of project members' push and merge permissions: every commit must be linked to a requirement, issue, or task (--story|--bug|--task), otherwise the push is rejected;
  4. Service interface processing often takes too long, but can't identify which part of the process is causing it? OK, let's integrate opentelemetry at the RPC framework layer, problems will be directly exposed in the tracing visualization interface;
  5. Going further, when OpenTelemetry is disabled during load testing to avoid its impact on external systems, developers' vague explanations of interface delays are sometimes unsatisfactory. OK, let's deploy the eBPF program go-ftrace on each machine. Whenever I want to look, I can analyze the time cost of each step in the processing logic.
  6. ...
  7. ...
  8. ...
  9. ...
  10. ...

"Newcomers don't know my past, past doesn't need to be told to newcomers" ... hahaha, indeed quite a bit of work has been done. Let's look at the tracking effect of my go-ftrace:

$ sudo ftrace -u 'main.*' -u 'fmt.Print*' ./main 'main.(*Student).String(s.name=(*+0(%ax)):c64, s.name.len=(+8(%ax)):s64, s.age=(+16(%ax)):s64)'
WARN[0000] skip main.main, failed to get ret offsets: no ret offsets 
found 14 uprobes, large number of uprobes (>1000) need long time for attaching and detaching, continue? [Y/n]

>>> press `y` to continue
y
add arg rule at 47cc40: {Type:1 Reg:0 Size:8 Length:1 Offsets:[0 0 0 0 0 0 0 0] Deference:[1 0 0 0 0 0 0 0]}
add arg rule at 47cc40: {Type:1 Reg:0 Size:8 Length:1 Offsets:[8 0 0 0 0 0 0 0] Deference:[0 0 0 0 0 0 0 0]}
add arg rule at 47cc40: {Type:1 Reg:0 Size:8 Length:1 Offsets:[16 0 0 0 0 0 0 0] Deference:[0 0 0 0 0 0 0 0]}
INFO[0002] start tracing                              

...

                           🔬 You can inspect all nested function calls, when and where started or finished
23 17:11:00.0890           main.doSomething() { main.main+15 github/go-ftrace/examples/main.go:10
23 17:11:00.0890             main.add() { main.doSomething+37 github/go-ftrace/examples/main.go:15
23 17:11:00.0890               main.add1() { main.add+149 github/go-ftrace/examples/main.go:27
23 17:11:00.0890                 main.add3() { main.add1+149 github/go-ftrace/examples/main.go:40
23 17:11:00.0890 000.0000        } main.add3+148 github/go-ftrace/examples/main.go:46
23 17:11:00.0890 000.0000      } main.add1+154 github/go-ftrace/examples/main.go:33
23 17:11:00.0890 000.0001    } main.add+154 github/go-ftrace/examples/main.go:27
23 17:11:00.0890             main.minus() { main.doSomething+52 github/go-ftrace/examples/main.go:16
23 17:11:00.0890 000.0000    } main.minus+3 github/go-ftrace/examples/main.go:51

                            🔍 Here, member fields of function receiver extracted, receiver is the 1st argument actually.
23 17:11:00.0891             main.(*Student).String(s.name=zhang<ni, s.name.len=5, s.age=100) { fmt.(*pp).handleMethods+690 /opt/go/src/fmt/print.go:673
23 17:11:00.0891 000.0000    } main.(*Student).String+138 github/go-ftrace/examples/main.go:64
23 17:11:01.0895 001.0005  } main.doSomething+180 github/go-ftrace/examples/main.go:22
                 ⏱️ Here, timecost is displayed at the end of the function call

...

>>> press `Ctrl+C` to quit.

INFO[0007] start detaching                            
detaching 16/16
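
The offsets 0, 8 and 16 in the argument expression passed to ftrace above (and echoed in the "add arg rule" lines) correspond to the memory layout of the receiver. The following is a small sketch, assuming a 64-bit platform and the gc compiler's current string representation (a data pointer followed by a length), which is an implementation detail rather than a language guarantee. The Student type here is a hypothetical stand-in with the two fields named in the trace; fieldOffsets is an illustrative helper.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Student is a hypothetical stand-in for the traced receiver type:
// a string field followed by an int field.
type Student struct {
	name string
	age  int
}

// fieldOffsets returns the byte offsets, relative to the start of a Student,
// of the name's data pointer, the name's length, and the age field.
func fieldOffsets() (namePtr, nameLen, age uintptr) {
	var s Student
	namePtr = unsafe.Offsetof(s.name)
	// Assumption: a string is laid out as (data pointer, length),
	// so the length sits one pointer-size after the data pointer.
	nameLen = unsafe.Offsetof(s.name) + unsafe.Sizeof(uintptr(0))
	age = unsafe.Offsetof(s.age)
	return namePtr, nameLen, age
}

func main() {
	p, l, a := fieldOffsets()
	// On a 64-bit platform this prints: name.ptr=+0 name.len=+8 age=+16
	fmt.Printf("name.ptr=+%d name.len=+%d age=+%d\n", p, l, a)
}
```

Read this way, s.name needs one extra dereference (follow the pointer stored at offset 0 to reach the bytes), while s.name.len and s.age are read directly at offsets 8 and 16 from the receiver pointer, which matches the Deference values in the three rules.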

This eBPF-based tracking tool can be used to analyze the execution history of a Go program. You don't need to read code mechanically or drive execution with a debugger. You just run go-ftrace against the program once, and it outputs the functions executed during that time that match the patterns you asked it to track. Then you can read the source code with purpose, achieving twice the result with half the effort!
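
The trace above is easier to read next to a program of the same shape. The following is a hypothetical reconstruction based only on the function names and call nesting visible in the trace; it is not the actual examples/main.go from go-ftrace, and the function bodies and arguments are assumptions. The //go:noinline directives are added here so that each function remains a real call that can show up in a trace.

```go
package main

import (
	"fmt"
	"time"
)

// Student has the two fields that appear in the trace: name and age.
type Student struct {
	name string
	age  int
}

// String is invoked by fmt through the fmt.Stringer interface, which is why
// the trace shows it being called from inside the fmt package.
//
//go:noinline
func (s *Student) String() string {
	return fmt.Sprintf("%s is %d years old", s.name, s.age)
}

//go:noinline
func add3(a, b int) int { return a + b }

//go:noinline
func add1(a, b int) int { return add3(a, b) }

//go:noinline
func add(a, b int) int { return add1(a, b) }

//go:noinline
func minus(a, b int) int { return a - b }

// doSomething produces the nesting seen in the trace:
// add -> add1 -> add3, then minus, then Student.String via fmt.
//
//go:noinline
func doSomething() {
	_ = add(1, 2)
	_ = minus(3, 1)
	s := &Student{name: "zhang", age: 100}
	fmt.Println(s)
	// The sleep accounts for most of doSomething's roughly one second of time cost.
	time.Sleep(time.Second)
}

func main() {
	doSomething()
}
```

With a program like this, the pattern 'main.*' selects doSomething, add, add1, add3, minus and (*Student).String, which is the set of calls shown nested in the output.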

LLM as a Powerful Tool

Hahaha, LLMs are now a very good method too: "hi, please explain this code to me". I use this approach often now, and it is usually very helpful.

These are some AI products and large models I frequently use:

  • Website: claude.ai / you.com / chatgpt.com / gemini.google.com / sourcegraph.com
  • App: Tencent Yuanbao / Doubao / Kimi / Gemini
  • LLM: claude / gpt-4o / qwen2.5 / gemma3 / deepseek / hunyuan
  • VSCode Extension: continue / copilot / cody ai / ...
  • Chrome Extension: Page Assist
  • Self-Hosted: Open-WebUI

Other Methods

Developer wisdom is not something I can enumerate completely. What I've listed are some experiences from my personal career. If you have better methods for understanding program execution flow, please feel free to share.

Summary

Readers may have started out wanting to understand debugger development. After these sections, which spend considerable space on the Go compilation toolchain, they may also want to understand the design and implementation of the toolchain, the Go runtime, and the Go standard library. The author certainly understands how much a technology enthusiast wants to exhaust every detail, so I've shared some methods that helped me master the "details" of medium to large projects during similar work and study, in case you need them. Business code with relatively simple CRUD logic can be understood from documentation, slides, or someone's general description; some projects, by contrast, demand "precision" and "rigor". I greatly admire technical people who are willing to invest personal time in these seemingly boring details. Your investment here will keep strengthening your wings, allowing you to fly higher.

ps: When I say higher, I don't mean success in the worldly sense, but rather a kind of "transcendence".
