SysAdmin Blog | Alexander Bochmann

raspbian jessie - rsyslogd-2007: action 'action 17' suspended, next retry ...

Alexander Bochmann Monday 10 of April, 2017

On a headless Raspberry Pi running raspbian/jessie, the /var/log/messages file is filling up with entries like these:

 rsyslogd-2007: action 'action 17' suspended, next retry is [..date..] [ try http://www.rsyslog.com/e/2007 ]

It seems this message is generated when rsyslogd isn't able to deliver syslog messages to one of the destinations in rsyslog.conf

In the case of raspbian, it's obviously the entry at the end of the config that tries to pipe messages to |/dev/xconsole - which doesn't exist on a system that doesn't run X11...

The messages disappear after commenting out or deleting the corresponding lines:

/etc/rsyslog.conf

#daemon.*;mail.*;\
#       news.err;\
#       *.=debug;*.=info;\
#       *.=notice;*.=warn       |/dev/xconsole

I really should file a bug report for this...
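
To spot this kind of entry before it floods the log, here is a minimal Python sketch that lists pipe destinations (|/path) on active rsyslog.conf lines whose target does not exist. The function name missing_pipe_targets and the exists parameter are made up for this example, and it only understands the simple selector/action layout shown above, not the full rsyslog syntax.

```python
import os
import re


def missing_pipe_targets(conf_text, exists=os.path.exists):
    """Return pipe targets (|/path) on active config lines that do not exist."""
    # Join continuation lines that end in a backslash, as in the stock config.
    joined = re.sub(r"\\\n\s*", "", conf_text)
    missing = []
    for line in joined.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is commented out.
        if not line or line.startswith("#"):
            continue
        match = re.search(r"\s\|(/\S+)$", line)
        if match and not exists(match.group(1)):
            missing.append(match.group(1))
    return missing
```

Feeding it the contents of /etc/rsyslog.conf on a box without X11 should report /dev/xconsole as long as the block is still active, and nothing once it is commented out.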

Splunk eval vs. variable names with dashes

Alexander Bochmann Wednesday 05 of April, 2017

I'm pretty certain I used to know this - but for the next time I'm putting this into a search engine and don't find it in the Splunk docs:

One of our data sources writes structured data into our Splunk installation which contains variable names with dashes - in this particular case, access-time

It's no problem using such a variable in a lot of Splunk operations, but it fails in an eval, as it will be interpreted as a mathematical operation (access minus time).

There are two options to work around that:

the one mentioned in the Splunk documentation: Put the variable name in single quotes, i.e. | eval newtime='access-time' - constant
the other one is to simply rename the variable before working on it: | rename access-time AS accesstime | eval newtime=accesstime - constant
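
If searches are generated from a script, the quoting can be done mechanically. A minimal Python sketch follows; the helper names eval_field and eval_subtract are hypothetical, and the rule used here (anything other than letters, digits and underscore needs single quotes) is my reading of the eval documentation, so treat it as an assumption. Field names that themselves contain a single quote are not handled.

```python
import re

# Names made only of letters, digits and underscores (not starting with a
# digit) are used bare; everything else gets wrapped in single quotes.
_BARE_FIELD = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def eval_field(name):
    """Return a field name in a form that eval reads as a field reference."""
    if _BARE_FIELD.match(name):
        return name
    return "'" + name + "'"


def eval_subtract(target, field, constant):
    """Build an eval clause that subtracts a constant from a field."""
    return "| eval {}={} - {}".format(target, eval_field(field), constant)
```

Calling eval_subtract("newtime", "access-time", "constant") gives the same clause as the first option above.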

downgrading Android apps using data from TWRP backups

Alexander Bochmann Tuesday 28 of March, 2017

Mostly as a reminder to myself when I'm looking to solve this kind of problem the next time: Since the March 22, 2017 version of the FortiClient VPN Android app kept crashing on my mobile (still running the last Cyanogenmod 13 snapshot) as soon as I tried to switch away to the launcher, I wanted to downgrade the app.

Unfortunately, there's no copy on apkmirror.com or F-Droid, and I don't know of any other reasonably trustworthy sources. I had also already removed and reinstalled the app, so recovering the old version on the phone didn't seem to be an option either.

Fortunately, I take TWRP backups now and then, so I tried looking at one of those. For once, having unencrypted backups turned out to be really convenient: A TWRP data.ext4.win file is just a tar.gz, so I was able to recover the app/com.fortinet.forticlient_vpn-1/base.apk file (using 7Zip on Windows), and copy that over to my phone. After uninstalling the current version of the FortiClient app, I just reinstalled the program with the CM file manager using the restored base.apk as a source. Done.
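
The same extraction works without 7Zip. Here is a minimal Python sketch using the standard tarfile module, assuming the backup really is a plain (optionally gzip-compressed) tar archive and that the APK sits under an app/<package>-N/ directory as in my backup. The function name extract_apk is made up for this example.

```python
import shutil
import tarfile


def extract_apk(backup_path, package, dest_path):
    """Copy app/<package>-*/base.apk out of a TWRP data backup to dest_path.

    Returns the archive member name that was extracted, or None if not found.
    """
    # "r:*" lets tarfile detect gzip compression (or none) on its own.
    with tarfile.open(backup_path, "r:*") as archive:
        for member in archive:
            parts = member.name.strip("/").split("/")
            if (member.isfile() and len(parts) >= 3
                    and parts[-1] == "base.apk"
                    and parts[-2].startswith(package + "-")
                    and parts[-3] == "app"):
                with archive.extractfile(member) as src, \
                        open(dest_path, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                return member.name
    return None
```

For the case above that would be extract_apk("data.ext4.win", "com.fortinet.forticlient_vpn", "base.apk").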

Cisco ASA logging: Disable hiding of usernames in failed admin logins

Alexander Bochmann Thursday 23 of March, 2017

Cisco ASA firewalls don't log, by default, the username used in a failed administrator login. Instead, the login is masked out using "*" characters:

%ASA-6-113005: AAA user authentication Rejected : reason = AAA failure : server = 10.1.1.1 : user = ***** : user IP = 192.168.0.10

The rationale is that users sometimes enter their password instead of the username, and the password will then end up in logs. As we're using two-factor authentication for admin logins, that doesn't apply to us.

That behaviour was actually tracked as a bug in Cisco's bug database (cache), and while the article mentions that a command was introduced to change this behaviour, the command itself isn't mentioned.

After some fiddling on the ASA command line I found this statement:

no logging hide username

The corresponding button in the ASDM GUI is in Device Management -> Logging -> Syslog Setup: "Hide username if its validity cannot be determined"
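
Once usernames show up in the log, they can be pulled out for reporting. A minimal Python sketch, assuming the message layout is exactly that of the sample line above (I have not checked other variants of %ASA-6-113005); the name parse_rejected_login is made up for this example.

```python
import re

_REJECTED = re.compile(
    r"%ASA-6-113005: AAA user authentication Rejected : "
    r"reason = (?P<reason>.+?) : server = (?P<server>\S+) : "
    r"user = (?P<user>.+?) : user IP = (?P<user_ip>\S+)"
)


def parse_rejected_login(line):
    """Return the fields of a rejected-login message, or None if no match."""
    match = _REJECTED.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # A user field made up only of "*" means the ASA masked the name.
    fields["masked"] = set(fields["user"]) == {"*"}
    return fields
```

With the default setting every result has masked set to True; after no logging hide username the user field should carry whatever was typed at the prompt.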

so I didn't notice that my OpenBSD vserver had broken IPv6 for quite some time...

Alexander Bochmann Sunday 19 of February, 2017

...until I had a look at the DNS server log, which showed errors contacting other servers via IPv6.

The hoster I'm using has a somewhat strange IPv6 setup where you get a /64 for your system, but the default gateway is just fe80::1 - when I originally set up the system, I put that into /etc/mygate without thinking much about it.

This was initially ok for quite some time, but it seems the default route vanished at some point. (In retrospect I don't quite understand why the setup ever worked at all, as the lo0 loopback interface has fe80::1 auto-assigned too...)

Then I remembered that fe80:: carries interface tags, since it exists on any IPv6-enabled interface, and the OS needs some way to decide which fe80:: it has to deal with right now.

Edited /etc/mygate accordingly, and things are back to normal (vio is OpenBSD's VirtIO network device driver, so my virtual ethernet device is vio0):

fe80::1%vio0
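
The rule of thumb (a link-local gateway is only unambiguous with an interface tag) is easy to check in a script. A minimal Python sketch; the function name check_gateway is made up, and it only looks at a single address string, not at the real format rules of /etc/mygate.

```python
import ipaddress


def check_gateway(entry):
    """Split a gateway entry into address and zone and flag a missing zone."""
    address_text, _, zone = entry.strip().partition("%")
    address = ipaddress.ip_address(address_text)
    # Only IPv6 link-local addresses (fe80::/10) need an interface tag.
    needs_zone = address.version == 6 and address.is_link_local
    return {
        "address": str(address),
        "zone": zone or None,
        "ambiguous": needs_zone and not zone,
    }
```

check_gateway("fe80::1") comes back with ambiguous set, while check_gateway("fe80::1%vio0") does not.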

Linux ATA bus errors with ASMedia ASM1062 PCIe card

Alexander Bochmann Saturday 22 of October, 2016

I recently added a cheap ASM1062 2-port SATA card to my Linux box at home, since its Asus C8HM70-I board only has two SATA ports, and I wanted to use an additional small SSD as boot device.

With my disks hooked up to the new card, I started to get SATA errors when there was moderate write load:

kernel log

ata5.00: exception Emask 0x10 SAct 0x7c000000 SErr 0x400000 action 0x6 frozen
ata5.00: irq_stat 0x08000000, interface fatal error
ata5: SError: { Handshk }
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/00:d0:00:2b:6f/0a:00:ac:00:00/40 tag 26 ncq 1310720 out
         res 40/00:f4:00:53:6f/00:00:ac:00:00/40 Emask 0x10 (ATA bus error)
ata5.00: status: { DRDY }
ata5.00: failed command: WRITE FPDMA QUEUED
ata5.00: cmd 61/00:d8:00:35:6f/0a:00:ac:00:00/40 tag 27 ncq 1310720 out
         res 40/00:f4:00:53:6f/00:00:ac:00:00/40 Emask 0x10 (ATA bus error)
[..]
ata5: hard resetting link
ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata5.00: configured for UDMA/133
ata5: EH complete

I'm not yet ready to blame the card itself, since I remembered I recycled a pair of rather old SATA cables to connect the drives, and the card supports SATA 6G... The mainboard itself has just one SATA 6G connector, and with that I used different cables that clip into the port, but the clip mechanism doesn't work with the connectors on the ASMedia card.

For now, I turned the SATA link speed down to 3G by adding a libata.force parameter to the kernel command line:

libata.force=5:3.0G,6:3.0G

(5 and 6 correspond to ata5 and ata6 from the libata kernel messages.)

This seems to work as a stopgap measure - the bus errors haven't reappeared since.

Before:

ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

With libata.force:

ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
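
To check after a reboot that the limit actually took effect, the link-up messages can be parsed. A minimal Python sketch follows, with two assumptions on my part: that the SStatus/SControl values are printed in hex, and that bits 7:4 of SControl hold the allowed speed generation (0 = no limit, 2 = limited to 3.0 Gbps), which matches the 300 vs. 320 values above. The names link_speeds and force_parameter are made up for this example.

```python
import re

_LINK_UP = re.compile(
    r"ata(?P<port>\d+): SATA link up (?P<gbps>[\d.]+) Gbps "
    r"\(SStatus (?P<sstatus>[0-9A-Fa-f]+) SControl (?P<scontrol>[0-9A-Fa-f]+)\)"
)


def link_speeds(log_text):
    """Map ATA port number to negotiated speed and SControl speed limit."""
    result = {}
    for match in _LINK_UP.finditer(log_text):
        scontrol = int(match.group("scontrol"), 16)
        result[int(match.group("port"))] = {
            "gbps": float(match.group("gbps")),
            # Bits 7:4 of SControl: allowed speed generation, 0 = no limit.
            "limit_gen": (scontrol >> 4) & 0xF,
        }
    return result


def force_parameter(ports, speed="3.0G"):
    """Build the libata.force kernel parameter for the given ATA ports."""
    return "libata.force=" + ",".join(
        "{}:{}".format(port, speed) for port in ports)
```

force_parameter([5, 6]) reproduces the parameter used above, and link_speeds() on the kernel log should show limit_gen 2 for both ports afterwards.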

syslog-ng and RcvbufErrors on Linux

Alexander Bochmann Tuesday 10 of May, 2016

We're running a syslog-ng installation to collect syslog data from quite a lot of systems (and then selectively feed them into our Splunk installation). Almost all of these send syslog via UDP.

Recently, when adding a couple more machines, I noticed that the syslog server is dropping UDP datagrams:

udp RcvbufErrors

# netstat -su | grep -A6 "^Udp:"
Udp:
    518026364 packets received
    36078 packets to unknown port received.
    23164168 packet receive errors
    1248583 packets sent
    RcvbufErrors: 23164167
UdpLite:

Yikes!

This is mentioned in the syslog-ng OSE docs, but it seems no one here ever got to that section, including myself.

So, in that context I learned about the so-rcvbuf() parameter to the udp() source in syslog-ng, and the Linux kernel net.core.rmem_max sysctl...

Kernel configuration

# sysctl -w net.core.rmem_max=16777216

(add the same parameter to /etc/sysctl.conf)

syslog-ng.conf

source s_net {  
                udp(ip(0.0.0.0) port(514) so-rcvbuf(8388608)); 
};

(There's no reason why so-rcvbuf() couldn't be the same as rmem_max, and neither needs to be a multiple of 1024 - both just bad habits of mine...)

Don't increase net.core.rmem_default, as that would make the Linux kernel use a bigger buffer for every UDP socket being created on the system.
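
How the two settings interact can be seen from any program that asks for a receive buffer. A minimal Python sketch (the function name granted_rcvbuf is made up): it requests a buffer size on a fresh UDP socket and returns what the kernel reports back. As far as I understand socket(7), Linux caps the request at net.core.rmem_max and then reports double the granted value to account for bookkeeping overhead, so the number returned is not the number requested.

```python
import socket


def granted_rcvbuf(requested):
    """Request a UDP receive buffer and return the size the kernel reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

If granted_rcvbuf(8388608) returns something far smaller than expected, rmem_max is still too low for the so-rcvbuf() value in syslog-ng.conf.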

The RcvbufErrors counter hasn't been increasing since that change, but I'll add monitoring for that, so drops won't go unnoticed in the future.
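
For the monitoring part, the counter can be read directly from /proc/net/snmp instead of parsing netstat output. A minimal Python sketch, assuming the usual layout of a "Udp:" header line followed by a "Udp:" value line; the names udp_counters and new_drops are made up for this example.

```python
def udp_counters(snmp_text):
    """Parse the two "Udp:" lines of /proc/net/snmp into a name -> value dict."""
    # "UdpLite:" lines do not start with "Udp:" and are skipped here.
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    if len(rows) != 2:
        raise ValueError("expected a Udp header line and a Udp value line")
    names, values = rows
    return dict(zip(names, (int(value) for value in values)))


def new_drops(previous, current):
    """Return how many receive buffer drops happened between two samples."""
    return current["RcvbufErrors"] - previous["RcvbufErrors"]
```

A check script would keep the previous sample around, read /proc/net/snmp again, and alert when new_drops() is greater than zero.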

killing your network with Cisco ASA 9.x identity NAT and proxy arp

Alexander Bochmann Sunday 17 of April, 2016

I was about to prepare a longer blog post on one of the pitfalls when migrating the NAT ruleset of an older Cisco ASA to a 9.x release - but as it turns out, the problem is already documented pretty well by Cisco, if you know what to look for...

With "Twice NAT", as implemented in 9.x software versions, an ASA firewall in routed mode will automatically do proxy ARP for all addresses covered by a NAT rule, to attract traffic for them. This is usually an intended effect, unless you're configuring Identity NAT rules (used to inhibit address translation for certain source/destination pairs) that cover address space locally connected to the firewall. This was not a problem with NAT exempt rules on older ASA software, but if such a rule is used now without the no-proxy-arp parameter, the ASA will act as a blackhole for traffic on the local network segment, by sending proxy-ARP replies for addresses it doesn't own.

In Proxy ARP Problems with Identity NAT (cache), Cisco illustrates the problem with this diagram:

image copied from vendor documentation, (c) Cisco

Yeah, don't do that. Always consider whether no-proxy-arp is required for a NAT rule before it is deployed.

(Also see ASA FAQ: Why does the ASA reply to ARP requests for other IP addresses in the subnet? (cache).)
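
When reviewing a migrated ruleset, the risky rules are the identity NAT entries whose networks overlap a subnet that is directly connected to the firewall. A minimal Python sketch of that check; it works on plain lists of prefixes that you would have to pull out of the config yourself, the function name needs_no_proxy_arp is made up, and the addresses in the usage note are invented.

```python
import ipaddress


def needs_no_proxy_arp(rule_networks, connected_networks):
    """Return identity NAT networks that overlap a directly connected subnet."""
    connected = [ipaddress.ip_network(net) for net in connected_networks]
    flagged = []
    for text in rule_networks:
        network = ipaddress.ip_network(text)
        # Compare only networks of the same IP version.
        if any(network.overlaps(other) for other in connected
               if other.version == network.version):
            flagged.append(text)
    return flagged
```

For example, needs_no_proxy_arp(["10.0.0.0/8", "192.168.50.0/24"], ["10.1.1.0/24"]) flags only 10.0.0.0/8, which is the rule to look at before deployment.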

Cyanogenmod 12.1 device encryption fails after wiping filesystems with TWRP

Alexander Bochmann Wednesday 25 of November, 2015

I recently bought a 2nd hand Android mobile (Samsung) to install Cyanogenmod on. The process is quite straightforward from the documentation on the CM website. I installed TWRP using Heimdall and wiped the system partitions from the recovery before installing CM 12.1.

Once running Cyanogenmod, I wasn't able to activate device encryption though. I unsuccessfully tried several of the tips out there, like disabling SELinux before starting the encryption process. After retrying with an active adb logcat, I found this message in the log:

E/Cryptfs (  183): Orig filesystem overlaps crypto footer region.  Cannot encrypt in place.

...which in turn led me to this thread on the Cyanogenmod forums (cache). The hint to resize the data partition is correct, but it's not actually required to reformat the filesystem, as Android comes with a resize2fs. So I booted into TWRP recovery and connected to the system via adb shell. Turns out that /data is mounted on /dev/block/mmcblk0p24:

# df
[..]
/dev/block/mmcblk0p24
                       5584700    931020   4653680  17% /data
/dev/block/mmcblk0p24
                       5584700    931020   4653680  17% /sdcard

After unmounting /data and /sdcard, I had a quick look at the partition with tune2fs:

# tune2fs -l /dev/block/mmcblk0p24
tune2fs 1.42.9 (28-Dec-2013)
Filesystem volume name:   
Last mounted on:          /data
Filesystem UUID:          17e3f4bc-acf2-631e-af53-921ea0c9e21a
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode filetype extent sparse_super large_file uninit_bg
Filesystem flags:         unsigned_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Remount read-only
Filesystem OS type:       Linux
Inode count:              355520
Block count:              1421307
Reserved block count:     0
Free blocks:              1163420
Free inodes:              353516
First block:              0
Block size:               4096
Fragment size:            4096
[..]

So, 1421307 blocks of 4096 bytes. Since the forum thread was not quite clear on how much space is required to facilitate encryption, I decided to shrink the filesystem by 8 blocks (32k):

# e2fsck -fy /dev/block/mmcblk0p24 
# resize2fs /dev/block/mmcblk0p24 1421299

...rebooted into CM, and successfully activated system encryption without further problems.
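
The arithmetic behind the resize2fs argument, as a minimal Python sketch: it reads "Block count" and "Block size" from tune2fs -l output and subtracts the number of blocks to free. The function name shrink_target is made up, and the default of 8 blocks is just the value I picked above, not a documented requirement.

```python
import re


def shrink_target(tune2fs_output, blocks_to_free=8):
    """Return (new_block_count, freed_bytes) for shrinking by blocks_to_free."""
    def field(name):
        # Anchor at line start so "Reserved block count" is not picked up.
        match = re.search(r"^{}:\s*(\d+)\s*$".format(re.escape(name)),
                          tune2fs_output, re.MULTILINE)
        if match is None:
            raise ValueError("missing field: " + name)
        return int(match.group(1))

    block_count = field("Block count")
    block_size = field("Block size")
    return block_count - blocks_to_free, blocks_to_free * block_size
```

With the values above (1421307 blocks of 4096 bytes) this gives 1421299 blocks and 32768 bytes freed, which is the number passed to resize2fs.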

not a good idea: running Windows 10 update on a machine with Truecrypt system encryption

Alexander Bochmann Monday 17 of August, 2015

Yeah, that's not much of a surprise.

Details on Google+
