Skip to content

Instantly share code, notes, and snippets.

@sammko
Created May 9, 2018 17:32
Show Gist options
  • Save sammko/459a38083f9c6b9aa898dd6d8640b483 to your computer and use it in GitHub Desktop.
Save sammko/459a38083f9c6b9aa898dd6d8640b483 to your computer and use it in GitHub Desktop.

Projekt 2

created by xhanze10

Workflow

  1. Run the crash utility on core dump - crash vmlinux vmcore
KERNEL: vmlinux
DUMPFILE: vmcore  [PARTIAL DUMP]
  CPUS: 8
  DATE: Wed Apr  4 16:22:28 2018
UPTIME: 00:04:21
LOAD AVERAGE: 3.88, 1.80, 0.70
 TASKS: 176
NODENAME: panicker                                          <-- HAHA
RELEASE: 3.10.0-861.el7.PROJECT.v2.x86_64
VERSION: #1 SMP Wed Apr 4 14:56:51 CEST 2018
MACHINE: x86_64  (3591 Mhz)
MEMORY: 4 GB
 PANIC: "SysRq : Trigger a crash"                           <-- Crash was triggered by user by Ctrl+SysRq+c or by writing to /sys
   PID: 10491
COMMAND: "bash"
  TASK: ffff9a6477e0bf40  [THREAD_INFO: ffff9a63f041c000]
   CPU: 0
 STATE: TASK_RUNNING (SYSRQ)    
  1. crash> files
0 ffff9a6478374d00 ffff9a647ac34000 ffff9a64796c7820 CHR  /dev/tty4
1 ffff9a6479300900 ffff9a63f27fd900 ffff9a647ac30fa0 REG  /proc/sysrq-trigger
2 ffff9a6478374d00 ffff9a647ac34000 ffff9a64796c7820 CHR  /dev/tty4
10 ffff9a6478374d00 ffff9a647ac34000 ffff9a64796c7820 CHR  /dev/tty4
255 ffff9a6478374d00 ffff9a647ac34000 ffff9a64796c7820 CHR  /dev/tty4

User triggered the crash by writing to /proc/sysrq-trigger from tty4

  1. Running crash> log shows that 3 tasks (ethtool:10433, dhclient:10462, ip:10485) have been blocked for more than 120s

  2. changing current context to each of them and running crash> bt shows they all share first/last 4 function calls

#0 [...] __schedule at ffffffff83112444
#1 [...] schedule_preempt_disabled at ffffffff831139d9
#2 [...] __mutex_lock_slowpath at ffffffff83111797
#3 [...] mutex_lock at ffffffff83110b7f

This indicates a deadlock.

  1. for both dhclient and ip the mutex_lock fn is called from rtnetlink_rcv which locks the rtnl_mutex mutex for ethtool the last called fn before calling a lock is e1000_release_eeprom which locks the e1000_eeprom_lock mutex

  2. crash> print rtnl_mutex

...
owner: 0xffff9a647594cf10
...
  1. crash> print e1000_eeprom_lock
...
owner: 0xffff9a647594cf10
...
  1. crash> struct task_info 0xffff9a647594cf10
...
pid: 10433 <-- the ethtool process
...

Now we see that the ethtool has both mutexes locked has has frozen at locking e1000_eeprom_lock. This which means the error will be in the ethtool backtrace.

PID: 10433  TASK: ffff9a647594cf10  CPU: 4   COMMAND: "ethtool"
 #0 [ffff9a63f949fa78] __schedule at ffffffff83112444
 #1 [ffff9a63f949fb08] schedule_preempt_disabled at ffffffff831139d9
 #2 [ffff9a63f949fb18] __mutex_lock_slowpath at ffffffff83111797
 #3 [ffff9a63f949fb78] mutex_lock at ffffffff83110b7f
 #4 [ffff9a63f949fb90] e1000_release_eeprom at ffffffffc018b51c [e1000]
 #5 [ffff9a63f949fbb0] e1000_read_eeprom at ffffffffc018e475 [e1000]
 #6 [ffff9a63f949fbf0] e1000_get_eeprom at ffffffffc01931b9 [e1000]
 #7 [ffff9a63f949fc48] ethtool_get_any_eeprom at ffffffff82ff3d4f
 #8 [ffff9a63f949fcb0] dev_ethtool at ffffffff82ff5762
 #9 [ffff9a63f949fd98] dev_ioctl at ffffffff83006bdf                    <-- this is where the fun part begins
#10 [ffff9a63f949fe28] sock_do_ioctl at ffffffff82fca4bd
#11 [ffff9a63f949fe50] sock_ioctl at ffffffff82fca6c8
#12 [ffff9a63f949fe80] do_vfs_ioctl at ffffffff82c2fb90
#13 [ffff9a63f949ff00] sys_ioctl at ffffffff82c2fe41
#14 [ffff9a63f949ff50] system_call_fastpath at ffffffff8311f7d5
    RIP: 00007f47de672567  RSP: 00007ffd0f79a8b0  RFLAGS: 00010246
    RAX: 0000000000000010  RBX: 00007ffd0f79ac40  RCX: 0000000000000000
    RDX: 00007ffd0f79ac50  RSI: 0000000000008946  RDI: 0000000000000003
    RBP: 00000000019ea010   R8: 00000000019ea000   R9: 0000000000000090
    R10: 0000000000021000  R11: 0000000000000246  R12: 0000000000000080
    R13: 00007ffd0f79ac50  R14: 0000000000000000  R15: 00007ffd0f79ab20
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
  1. Investigating the backtrace we can see dev_ethtool function calls ethtool_get_eeprom which in return calls ethtool_get_eeprom, which calls ethtool_get_any_eeprom with a getter function set to the one from driver - e1000_get_eeprom (drivers/net/ethernet/intel/e1000/e1000_ethtool.c:443) which in return calls e1000_read_eeprom (drivers/net/ethernet/intel/e1000/e1000_hw.c:3890)

  2. e1000_read_eeprom locks the e1000_eeprom_lock mutex and calls the e1000_do_read_eeprom function which reads the data and calls the e1000_release_eeprom (note that the e1000_eeprom_lock mutex is still locked by e1000_read_eeprom)

  3. After doing some HW-specific operations the functions tries to lock the e1000_eeprom_lock before stopping to request EEPROM access in case the nested_mutex is set to a non-zero value

    Which it is because: of this line (/net/core/ethtool.c:1552 - ethtool_get_any_eeprom):

    ...
    *data = !strcmp("eth1", dev->name) ? 1 : 0;
    ...
    

    which sets data to 1 if device name is eth1. This variable is than passwd to e1000_get_eeprom as its 3rd parameter (u8 *bytes), which is afterwards used like so

    nested_mutex = *bytes;
    

    The process is now trying to lock nested_mutex, which it already owns, so it blocks itself.

  4. And the reason both dhclient or ip processes are locked as well is that before calling dev_ethtool dev_ioctl calls rtnl_lock locking the rtnl_mutex.

Conclusion

One of the possible solutions would be to move the e1000_release_eeprom out of the e1000_do_..._eeprom functions to their callers:

diff --git a/drivers/net/ethernet/intel/e1000/e1000_hw.c b/drivers/net/ethernet/intel/e1000/e1000_hw.c
index bfe5625..b2580d3 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_hw.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_hw.c
@@ -3894,6 +3894,10 @@ s32 e1000_read_eeprom(struct e1000_hw *hw, u16 offset, u16 words, u16 *data)
        mutex_lock(&e1000_eeprom_lock);
        ret = e1000_do_read_eeprom(hw, offset, words, data);
        mutex_unlock(&e1000_eeprom_lock);
+
+       /* End this read operation */
+       e1000_release_eeprom(hw);
+
        return ret;
 }

@@ -3982,9 +3986,6 @@ static s32 e1000_do_read_eeprom(struct e1000_hw *hw, u16 offset, u16 words,
                }
        }

-       /* End this read operation */
-       e1000_release_eeprom(hw);
-
        return E1000_SUCCESS;
 }

@@ -4067,6 +4068,10 @@ s32 e1000_write_eeprom(struct e1000_hw *hw, u16 offset, u16 words, u16 *data)
        mutex_lock(&e1000_eeprom_lock);
        ret = e1000_do_write_eeprom(hw, offset, words, data);
        mutex_unlock(&e1000_eeprom_lock);
+
+       /* Done with writing */
+       e1000_release_eeprom(hw);
+
        return ret;
 }

@@ -4103,9 +4108,6 @@ static s32 e1000_do_write_eeprom(struct e1000_hw *hw, u16 offset, u16 words,
                msleep(10);
        }

-       /* Done with writing */
-       e1000_release_eeprom(hw);
-
        return status;
 }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment