Thursday 13 October 2011 (1 post)
The issue
This morning (while I was running late for an appointment)
something very weird happened on my Thinkpad T61 laptop. Since I
recently treated myself to a shiny Thinkpad x201s, I have to admit I
don't use my T61 much anymore. But this morning I had to print a
page (for this appointment) and, as I hadn't yet configured my
printer on the x201s, I went to the T61. But I noticed that the
network was down. I quickly tried wireless but, bad luck, my
current wifi setup selects the channel automatically and it prefers
choosing channels which aren't available in the US. Guess what, my
T61 comes from the US and has those channels completely disabled,
so no wireless available either.
The investigation
I first tried to
modprobe -r e1000e
modprobe e1000e
to see if it fixed the problem, but it didn't. Worse, the
interface disappeared and never reappeared. I tried to reboot but
it didn't fix the problem, the link was still down. Running really
late, I put the file on a USB key, printed it from the powerbook
and postponed the fix until later.
Now, this evening, I tried to investigate a bit more. The symptoms
weren't only that the NIC wasn't working: there was also a high load
on the system (1-2 at idle), unresponsiveness every second or so,
and watching top I could see spikes of high CPU usage for the
kworker kernel thread. Searching for that on Google, you can find a lot of
people running into this issue, usually starting around kernel 2.6.36
or 2.6.37. Now, I might have upgraded the kernel recently to
3.0.0-4, but that didn't look related since the problem first
appeared when the laptop was up and running. And I tried to reboot
under 2.6.39, 2.6.38 and even 2.6.32 and the problem was still
present. Each time, unloading the module would fix the problem, but
loading it again wouldn't make the interface reappear. People
advised booting with pcie_ports=compat but that didn't do anything.
I tried to boot without intel_iommu=force (to disable Intel VT-d) and
without pcie_aspm (Active State Power Management), but that changed nothing either.
Suspecting a userland issue, I tried to boot a grml live distro (always keep a grml.iso in
your /boot, extlinux-update will even put it in your menu
automatically), and the problem was still present. So not a Debian
kernel issue, not a userland issue; the only thing left was the laptop.
I hadn't updated the BIOS recently, so I wondered exactly what could
be the problem. I started to feel a little bad, since I still
really like that laptop, and I had already decided to lend it to
my sister since her own T61 is sitting with a dead system board on
my shelf. I know she might have some negative waves, but she
hadn't even landed when the problem first appeared.
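When juggling boot parameters like pcie_ports=compat or intel_iommu=force, it helps to confirm that the running kernel actually received them. The following is a minimal shell sketch, assuming a Linux system that exposes /proc/cmdline; has_boot_param is a hypothetical helper name made up for this example, not something shipped by any package.

```shell
#!/bin/sh
# has_boot_param NAME [FILE]
# Hypothetical helper: succeeds if NAME (with or without "=value")
# appears as a word in the kernel command line.
# FILE defaults to /proc/cmdline and only exists as an argument
# so the function can be tried against a saved copy.
has_boot_param() {
    awk -v p="$1" '
        {
            for (i = 1; i <= NF; i++) {
                # Compare only the part before "=", so "pcie_ports"
                # matches "pcie_ports=compat" but "ro" does not match "root=...".
                split($i, kv, "=")
                if (kv[1] == p) found = 1
            }
        }
        END { exit !found }
    ' "${2:-/proc/cmdline}"
}

# Usage:
# has_boot_param pcie_ports && echo "pcie_ports is set"
```

Matching on the name before the "=" sign keeps the check exact, so a short parameter never matches a longer one by accident.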
The fix
Then I had a flash. It's no mystery that I have a habit of breaking network
cards, and I had the bright idea to shut down the laptop,
disconnect AC and battery, then let it idle a bit. I even tried the
secret Thinkpad power button code but I think it's unrelated.
Then I re-plugged the battery, booted to grml and the issue was
gone. I rebooted on the standard Debian and the link was up,
network was working.
So what happened?
The (tentative) explanation
My guess is that, somehow, the network card firmware has an
issue and choked on something (a network frame or an attack exactly
like the one we demonstrated on ASF firmware). In fact, no, I don't
think it's the e1000e firmware. My T61 comes with Intel vPro, which
includes AMT (Active Management Technology), a remote management
solution a bit like ASF but more advanced. As far as I know, AMT
firmware always runs, even when it's disabled, it's just completely
idle. Idle, but in this case I think it choked on something, and a
reboot isn't enough to restart the AMT firmware. But a real hard
reset without any power seems to do the trick.
What next?
Well, a part of me is pretty scared, but another is just bored.
I mean, we know about that, that's exactly the kind of issue we are
warning people about. I have no idea what exactly happened, and
there's no way I'll be able to reproduce that, but I'm pretty sure
it's something lying at a pretty low level in the platform, and
which can severely disable your workstation. Now if it happens
again I won't lose too much time on this.
TL;DR: helping other people
In case you came here because you searched Google for terms like
“kworker cpu usage”, e1000e, interrupts, it might be a good
idea to first reboot on a live CD to rule out installation issues,
then shut down the laptop, remove the battery and let it sit idle
for a few seconds. This might be enough to reset “something” inside and fix
the situation.
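If you want a rough number to go with the symptoms before pulling the battery, one option is to watch how fast the NIC's interrupt counters grow. The sketch below is only an illustration: irq_total is a hypothetical helper, eth0 is an assumed interface name, and nothing in this post establishes that interrupts were the actual cause.

```shell
#!/bin/sh
# irq_total PATTERN [FILE]
# Hypothetical helper: sums the per-CPU counters of every line in
# /proc/interrupts (or FILE) that matches PATTERN, e.g. an interface name.
irq_total() {
    awk -v pat="$1" '
        $0 ~ pat {
            # Field 1 is the IRQ number; the per-CPU counts follow
            # until the first non-numeric field (the controller name).
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) sum += $i
        }
        END { print sum + 0 }
    ' "${2:-/proc/interrupts}"
}

# Usage: sample twice, one second apart, and print the difference.
# before=$(irq_total eth0); sleep 1; after=$(irq_total eth0)
# echo "eth0 interrupts per second: $((after - before))"
```

A count that keeps climbing quickly on an idle link would at least be consistent with the kworker spikes described above.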
Corsac@22:44:53 (Debian)
Wednesday 19 October 2011 (1 post)
I recently received a mail about my attempt to provide Grsecurity kernels in Debian. The
sender found the bug by accident, and asked me why I hadn't
publicized it more here. So here we are.
I won't go into details on what grsecurity is, it's fairly
complex. But it's basically a hardening patch for the Linux kernel,
with three main components:
- the PaX patch, whose purpose is to harden the memory layout of
the Linux kernel and improve existing options: enforcement of
non-executable memory pages (in userland and in the kernel), W^X (no page
marked as both writable and executable), ASLR, prevention of invalid
userland pointer dereferences, stricter checks on copies between
userland and kernel memory…
- RBAC (Role Based Access Control), an implementation of
Mandatory Access Control
- various hardening features: /proc restrictions, chroot
restrictions, kernel symbols hiding etc.
A lot of this touches low-level stuff in the kernel, especially
memory management. Ideally this patch would be pushed upstream, but
Brad Spengler (grsecurity's main developer) already said he wasn't
interested in upstreaming it and upstream already said the patch
was too huge and invasive to include as is (especially since
the original authors aren't interested in maintaining it upstream).
There's an ongoing effort to split the patch and merge things
little by little, but in the meantime having a mid-term solution
would be nice.
I know of Debian users who rebuild grsecurity-patched kernels
themselves, and I know some of them would appreciate having them
included in the Debian kernel. Fortunately, the linux-2.6 source
package has a nice feature which is called featureset.
Basically it's a way to build some (binary) packages using a
different set of patches and a different config. For example this
was used to provide xen/openvz/vserver patchsets, and is now used
to provide rt kernels.
So I thought it'd be nice to provide a grsec featureset, and
started doing the work. I have a working setup for producing those
kernels, so I've opened a wishlist bug against the kernel (#605090) to have this
merged.
Those packages follow the sid kernel. There's ongoing work
for Squeeze, but it's a bit harder there because both the
grsecurity patchset and the Debian kernel ship a whole lot of
backports to the Linux kernel, meaning the grsecurity patch doesn't
apply directly to the Debian source package. Basically I need to
remove some of the hunks (since they are already applied to the
source) and port some others (since there is some backported code
not present in the vanilla 2.6.32, for example the drm code).
Until the patches are merged and the bug is closed, I host some
of the built packages at:
deb http://molly.corsac.net/~corsac/debian/kernel-grsec/packages/ sid/
The repository is signed by my key
which you can add to your apt setup using apt-key add. If you want
to rebuild the packages yourself, here's the method:
mkdir kernel-grsec
cd kernel-grsec
# Debian kernel packaging and the grsec patch series
svn checkout svn://svn.debian.org/svn/kernel/dists/sid/linux-2.6
git clone git://anonscm.debian.org/users/corsac/grsec-patches.git
# upstream tarball and its detached signature
wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.0.tar.bz2
wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.0.tar.bz2.sign
gpg --verify linux-3.0.tar.bz2.sign linux-3.0.tar.bz2
cd linux-2.6
apt-get build-dep linux-2.6
# apply the grsec patch series to the checkout
export QUILT_PATCHES=../grsec-patches
quilt push -a
python debian/bin/genorig.py ../linux-3.0.tar.bz2
debian/rules orig
fakeroot debian/rules source
fakeroot make -f debian/rules.gen binary-arch_amd64_grsec_amd64
You could also do dpkg-buildpackage, pdebuild or whatever.
The Kernel
handbook is a nice read too if you want more information on
how to rebuild Debian kernels. The quilt push -a may fail if you
check out an svn version more recent than mine. I try to keep
the patches up to date but I usually have some delay.
Note that installing the kernel will require installing the
linux-grsec-base package. The binary is not yet available on my mirror
but you can easily build it. The source can be found on
git.debian.org.
If you're interested in this, don't hesitate to mail me or the
bug.
Yves-Alexis@23:09:58 (Debian)
Monday 31 October 2011 (1 post)
Maybe it's because this year Noémie is with us (and has been for
more than 72 hours), or maybe because the Americans are
back and for them Halloween meant strolling through the
streets to watch the gay pride dressed up as zombies. Maybe because
this year (apart from Eurodisney and Parc Astérix) there was no
media hype around it.
This year I quite liked Halloween. We came across a zombie
or two in the street, who were under ten years old, and we
even had three little monsters come and ring the doorbell in the
late afternoon to get candy! Mind you, we were
absolutely not prepared, so all we had to offer them were the
poppy candies from the wedding, but we didn't tell
them, they were red, nobody noticed. They may get
a surprise when they taste them, though, and they'll try to guess what
it was, and besides, since they were only half-heartedly dressed up,
that'll teach them.
In small doses, it suits me (and today's doodle was cute, if
I may say so). (They really do have enormous
pumpkins over there.)
Corsac@20:27:54 (Echoes)