Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Support
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
O
OpenXG-RAN
Project overview
Project overview
Details
Activity
Releases
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Issues
0
Issues
0
List
Boards
Labels
Milestones
Merge Requests
1
Merge Requests
1
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Operations
Operations
Metrics
Environments
Analytics
Analytics
CI / CD
Repository
Value Stream
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
OpenXG
OpenXG-RAN
Commits
fe9d871a
Commit
fe9d871a
authored
Mar 11, 2024
by
Robert Schmidt
Browse files
Options
Browse Files
Download
Plain Diff
Merge remote-tracking branch 'origin/doc-ci-segv-debug' into integration_2024_w10
parents
127ae812
e615321e
Changes
2
Show whitespace changes
Inline
Side-by-side
Showing
2 changed files
with
114 additions
and
33 deletions
+114
-33
doc/TESTBenches.md
doc/TESTBenches.md
+58
-33
docker/debug_core_image.sh
docker/debug_core_image.sh
+56
-0
No files found.
doc/TESTBenches.md
View file @
fe9d871a
...
...
@@ -255,7 +255,44 @@ Some tests are run from source (e.g.
`ci-scripts/xml_files/gnb_phytest_usrp_run.xml`
), which directly give the
options they are run with.
## How to retrieve core dumps (for CI team members)
## How to debug CI failures
It is possible to debug CI failures using the generated core dump and the image
used for the run. A script is provided (see developer instructions below) that,
provided the core dump file, container image, and the source tree, executes
`gdb`
inside the container; using the core dump information, a developer can
investigate the cause of failure.
### Developer instructions
The CI team will send you a docker image and a core dump file, and the commit
as of which the pipeline failed. Let's assume the coredump is stored at
`/tmp/coredump.tar.xz`
, and the image is in
`/tmp/oai-nr-ue.tar.gz`
. First, you
should check out the corresponding branch (or directly the commit), let's say
in
`~/oai-branch-fail`
. Now, unpack the core dump, load the image into docker,
and use the script
[
`docker/debug_core_image.sh`
](
../docker/debug_core_image.sh
)
to open gdb, as follows:
```
cd /tmp
tar -xJf /tmp/coredump.tar.xz
docker load < /tmp/oai-nr-ue.tar.gz
~/oai-branch-fail/docker/debug_core_image.sh <image> /tmp/coredump ~/oai-branch-fail
```
where you replace
`<image>`
with the image loaded in
`docker load`
. The script
will start the container and open gdb; you should see information about where
the failure (e.g., segmentation fault) happened. If you just see
`??`
, the core
dump and container image don't match. Be also on the lookout for the
corresponding message from gdb:
```
warning: core file may not match specified executable file.
```
Once you quit
`gdb`
, the container image will be removed automatically.
### CI team instructions
The entrypoint scripts of all containers print the core pattern that is used on
the running machine. Search for
`core_pattern`
at the start of the container
...
...
@@ -267,19 +304,14 @@ logs to retrieve the possible location. Possible locations might be:
-
abrt: see
[
documentation
](
https://abrt.readthedocs.io/en/latest/usage.html
)
-
apport: see
[
documentation
](
https://wiki.ubuntu.com/Apport
)
You furthermore have to extract the executable that caused the core dump.
Download the container image, and extract, e.g.:
See below for instructions on how to retrieve the core dump. Further, download
the image and store it to a file using
`docker save`
. Make sure to pick the
right image (Ubuntu or RHEL)!
```
docker create --name c1 porcepix.sboai.cs.eurecom.fr/oai-gnb:develop-c99db698
docker cp c1:/opt/oai-gnb/bin/nr-softmodem /tmp
docker rm c1
```
### Core dump in a file
#### Core dump in a file
**
This is not recommended, as files could pile up and fill the system disk
completely!
**
Prefer
systemd or abrt instead
.
completely!
**
Prefer
another method further down
.
If the core pattern is a path: it should at least include the time in the
pattern name (suggested pattern:
`/tmp/core.%e.%p.%t`
) to correlate the time
...
...
@@ -287,38 +319,31 @@ the segfault occurred with the CI logs. If you identified the core dump,
copy the core dump from that machine; if identification is difficult, consider
rerunning the pipeline.
### Core dump via systemd
###
#
Core dump via systemd
Run this command to list all core dumps:
Use the first command to list all core dumps. Scroll down to the core dump of
interest (it lists the executables in the last column; use the time to
correlate the segfault and the CI run). Take the PID of the executable (first
column after the time). Dump the core dump to a location of your choice.
```
sudo coredumpctl list
```
Scroll to the end and find the core dump of interest (it lists the executables
in the last column; use the time to correlate the segfault and the CI run).
Take the PID of the executable (first column after the time). Dump the core
dump to a location of your choice:
```
sudo coredumpctl dump <PID> > /tmp/coredump
```
### Core dump via abrt (automatic bug reporting tool)
###
#
Core dump via abrt (automatic bug reporting tool)
TBD: use the documentation page for the moment.
### Core dump via apport
On Ubuntu machines, apport first needs to be enabled to collect core dumps:
```
sudo systemctl enable apport.service
```
and
[
needs to be enabled
](
https://wiki.ubuntu.com/Apport#How_to_enable_apport
)
.
Then, show a list of core dumps using
#### Core dump via apport
I did not find an easy way to use apport. Anyway, the systemd approach works
fine. So remove apport, install systemd-coredump, and verify it is the new
coredump handler:
```
sudo apport-cli
sudo systemctl stop apport
sudo systemctl mask --now apport
sudo apt install systemd-coredump
# Verify this changed the core pattern to a pipe to systemd-coredump
sysctl kernel.core_pattern
```
docker/debug_core_image.sh
0 → 100755
View file @
fe9d871a
#!/bin/bash
if
[
$#
-ne
3
]
;
then
echo
"usage:
$0
<image> <coredump> <path-to-sources>"
exit
1
fi
die
()
{
echo
$1
exit
1
}
IMAGE
=
$1
COREDUMP
=
$2
SOURCES
=
$3
set
-x
# the image/build_oai builds in cmake_targets/ran_build/build, so source
# information is relative to this path. In case the user did not compile on
# their computer, this directory will not exist. still allow to find it by
# creating it
BUILD_DIR
=
$SOURCES
/cmake_targets/ran_build/build
mkdir
-p
$BUILD_DIR
||
die
"cannot create
$BUILD_DIR
: is
$SOURCES
valid?"
# check if coredump is valid file
[
-f
$COREDUMP
]
||
die
"no such file:
$COREDUMP
"
# check if image exists, and determine type (gnb, nr-ue) for correct invocation
# of gdb
docker image inspect
$IMAGE
>
/dev/null
||
exit
1
if
[
$(
grep
"oai-gnb:"
<<<
$IMAGE
)
]
||
[
$(
grep
"oai-gnb-aerial:"
<<<
$IMAGE
)
]
;
then
EXEC
=
bin/nr-softmodem
TYPEPATH
=
oai-gnb
elif
[
$(
grep
"oai-nr-ue:"
<<<
$IMAGE
)
]
;
then
EXEC
=
bin/nr-uesoftmodem
TYPEPATH
=
oai-nr-ue
elif
[
$(
grep
"oai-enb:"
<<<
$IMAGE
)
]
;
then
EXEC
=
bin/lte-softmodem
TYPEPATH
=
oai-enb
elif
[
$(
grep
"oai-lte-ue:"
<<<
$IMAGE
)
]
;
then
EXEC
=
bin/lte-uesoftmodem
TYPEPATH
=
oai-lte-ue
else
die
"cannot determine if image is gnb or nr-ue: must match
\"
oai-gnb:
\"
or
\"
oai-nr-ue:
\"
"
fi
# run gdb inside a container. We mount the coredump and the sources inside the
# container, and run gdb with the core dump, the correct executable, and using
# the source directory to show correct line numbers
docker run
--rm
-it
\
-v
$COREDUMP
:/tmp/coredump
\
-v
$SOURCES
:/opt/
$TYPEPATH
/src
\
--entrypoint
bash
\
$IMAGE
\
-c
"gdb --dir=src/cmake_targets/ran_build/build
$EXEC
/tmp/coredump"
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment