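I recently built a compile-and-test machine for my own use.
OS: Ubuntu 18.04
Hardware: i7-10700 + MSI B460M Mortar + Mellanox ConnectX-4 Lx NIC
The NIC has two problems:
1. It runs very hot from the moment the machine powers on (whether or not the interface has been brought up with ifconfig), too hot to touch.
2. After a direct reboot the NIC disappears; even lspci cannot see it.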
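Cooling
To cool it down I added a high-airflow fan to the case. It was too loud at its default speed, so I lowered the speed in the MSI BIOS. The noise is now acceptable, and although the NIC is still hot, it no longer burns my fingers.
I tried to monitor the temperature with lm-sensors, but that failed: it only ever reports the ACPI thermal zone and the CPU cores, nothing for the NIC. Running sensors-detect did not help either.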
root@ckun-MS-2:~# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +119.0°C)
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +27.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +26.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +26.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +25.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +26.0°C (high = +80.0°C, crit = +100.0°C)
Core 4: +26.0°C (high = +80.0°C, crit = +100.0°C)
Core 5: +27.0°C (high = +80.0°C, crit = +100.0°C)
Core 6: +26.0°C (high = +80.0°C, crit = +100.0°C)
Core 7: +25.0°C (high = +80.0°C, crit = +100.0°C)
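The sensors command reads from the kernel's hwmon interface under /sys/class/hwmon, so a likely explanation is that the NIC driver on this system simply does not register a hwmon device. The sketch below lists whatever hwmon chips the kernel exposes, which is a quick way to confirm that. The function name list_hwmon_temps and its root parameter are my own for illustration, not part of lm-sensors.

```python
from pathlib import Path


def list_hwmon_temps(root="/sys/class/hwmon"):
    """Return [(hwmon_dir, chip_name, {file_name: degrees_c})] for each chip."""
    chips = []
    root_path = Path(root)
    if not root_path.is_dir():
        return chips
    for chip in sorted(root_path.iterdir()):
        name_file = chip / "name"
        if not name_file.is_file():
            continue
        temps = {}
        for temp_file in sorted(chip.glob("temp*_input")):
            try:
                # hwmon reports temperatures in millidegrees Celsius
                temps[temp_file.name] = int(temp_file.read_text().strip()) / 1000.0
            except (OSError, ValueError):
                continue  # sensor exists but could not be read
        chips.append((chip.name, name_file.read_text().strip(), temps))
    return chips
```

Calling list_hwmon_temps() and printing each entry should show the same chips that sensors reports; if nothing Mellanox-related appears, lm-sensors has nothing to read for the NIC.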
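Persistence paid off: I found this online, which describes a tool that can read the temperature.
https://www.mellanox/pdf/MFT/MFT_Linux_release_notes_3_0_0.pdf
So it looks like I need to install the driver and firmware. Let's get to it.
Installing the driver
Download the matching driver from the official site:
https://www.mellanox/products/infiniband-drivers/linux/mlnx_ofed
I used MLNX_OFED_LINUX-5.1-0.6.6.0-ubuntu18.04-x86_64.tgz.
After extracting it, run
./mlnxofedinstall
This installs directly: it fetches the required dependencies over the network automatically, and it also upgrades the NIC firmware automatically.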
Initializing...
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...
Device #1:
----------
Device Type: ConnectX4LX
Part Number: MCX4121A-ACA_Ax
Description: ConnectX-4 Lx EN network interface card; 25GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
PSID: MT_2420110034
PCI Device Name: 04:00.0
Base MAC: 1c34da6d6d62
Versions: Current Available
FW 14.26.1040 14.28.1002
PXE 3.5.0803 3.6.0101
UEFI 14.19.0014 14.21.0016
Status: Update required
---------
Found 1 device(s) requiring firmware update...
Device #1: Updating FW ...
Initializing image partition - OK
Writing Boot image component - OK
Done
Restart needed for updates to take effect.
Log File: /tmp/MLNX_OFED_LINUX.12542.logs/fw_update.log
Device (04:00.0):
04:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
Link Width: x4 ( WARNING - device supports x8 )
PCI Link Speed: 8GT/s
Device (04:00.1):
04:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
Link Width: x4 ( WARNING - device supports x8 )
PCI Link Speed: 8GT/s
Installation passed successfully
To load the new driver, run:
/etc/init.d/openibd restart
root@ckun-MS-2:~/pkgs/MLNX_OFED_LINUX-5.1-0.6.6.0-ubuntu18.04-x86_64# /etc/init.d/openibd restart
Unloading HCA driver: [ OK ]
Checking the temperature
54°C is fine; there should be no risk of the card burning out.
root@ckun-MS-2:~/pkgs# mget_temp -h
mget_temp - Get Hardware Temperature of Mellanox Technologies LTD devices
======================================================================================
Prints the current device temperature in degrees centigrade.
OPTIONS:
-h : print this help message
-d <dev> : mst device name
--version : display version info
root@ckun-MS-2:~/pkgs#
root@ckun-MS-2:~/pkgs# mget_temp -d 04:00.0
54
root@ckun-MS-2:~/pkgs#
root@ckun-MS-2:~/pkgs# mget_temp -d 04:00.1
54
root@ckun-MS-2:~/pkgs#
root@ckun-MS-2:~/pkgs#
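To avoid running the check by hand, the mget_temp call above can be wrapped in a small script. This is only a sketch: it assumes mget_temp prints a bare integer as in the transcript, and the 80°C limit is a placeholder I picked, not a value from Mellanox documentation. The names read_nic_temp and too_hot are hypothetical helpers.

```python
import subprocess


def read_nic_temp(device, runner=subprocess.run):
    """Run `mget_temp -d <device>` and return the temperature in degrees C."""
    proc = runner(
        ["mget_temp", "-d", device],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # works on the Python 3.6 shipped with Ubuntu 18.04
        check=True,
    )
    return int(proc.stdout.strip())


def too_hot(temp_c, limit_c=80):
    """Return True when the reading reaches the limit (80 is an assumed value)."""
    return temp_c >= limit_c
```

For example, looping over "04:00.0" and "04:00.1" and printing read_nic_temp(dev) from a cron job would give a simple temperature log. The runner parameter exists only so the function can be exercised without the real tool installed.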
Reboot once more and check whether the NIC is still there.
root@ckun-MS-2:~# lspci | grep Eth
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 8125 (rev 04)
04:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
04:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
root@ckun-MS-2:~#
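The same check can be scripted so that a missing card is noticed after every reboot. The sketch below only parses lspci text in the format shown above; find_nics is a hypothetical helper, and matching on the vendor string "Mellanox" is an assumption based on this transcript.

```python
def find_nics(lspci_text, vendor="Mellanox"):
    """Return PCI addresses of Ethernet controllers whose description names vendor."""
    addresses = []
    for line in lspci_text.splitlines():
        # Each lspci line looks like "<address> <class>: <vendor and model>"
        address, _, description = line.strip().partition(" ")
        if "Ethernet controller" in description and vendor in description:
            addresses.append(address)
    return addresses
```

Feeding it the output of lspci (for instance via subprocess.run) and comparing the result with the expected ["04:00.0", "04:00.1"] would flag the disappearing-NIC problem described at the top.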
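All done.
Summary
This Mellanox NIC is really meant for servers: it has no fan of its own, only a small heatsink, and relies on the airflow from a server's high-speed fans for cooling.
The other NIC in this machine, the Intel XXV710, does much better by comparison: it only gets warm, its heatsink is larger, passive cooling is enough, and it works well in both desktops and servers.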