4 November 2009 - 2:10 Flash and the storage hierarchy (again)

Earlier this year in a post titled “Disk is the new disk”, I ruminated a bit on the implications of the changing storage hierarchy. One of the frequently discussed changes on the horizon is the introduction of flash memory (or other solid state, non-volatile memories like phase-change memory, which I mentioned in that post) into the storage hierarchy. One question I didn’t address at the time is, “do these solid state non-volatile memories belong as external or internal memory?” That is to say, “do these belong on the memory bus or IO bus?” With flash memory, the latencies and writing semantics (i.e., individual bytes can’t be arbitrarily modified; entire blocks are erased at once) tend to naturally place flash memory devices on the IO bus as external memory. However, other technologies may be more appropriate to expose as directly CPU-addressable memory. A recent paper at SOSP ’09 explores this in the context of Phase-Change Memory (although the techniques are applicable to any directly-addressable BPRAM — byte-addressable persistent memory): “Better I/O Through Byte-Addressable, Persistent Memory”. Unlike flash, PCM can be read and written in a byte-addressable manner, and access times are fast enough to merit putting PCM on the CPU’s memory bus.

The paper introduces BPFS, a filesystem for BPRAM which exploits various properties of directly-addressable non-volatile memory to improve performance and durability over traditional filesystems. It’s interesting how a lot of the very thorny problems of filesystem consistency associated with block-based disk interfaces are solved elegantly by applying traditional in-memory style atomic updates to data structures in BPRAM. Traditional filesystems use relatively hairy techniques like soft-updates or journaling to try to ensure that persistent data structures are updated in a way that preserves metadata consistency even if the system fails in the middle of a write. With byte-modifiable data structures, you can use atomic update techniques similar to those used in lock-free data structures (e.g. do modifications on a private copy of data and then atomically “publish” it by setting a pointer field pointing to the private copy, and make sure to put architecture-appropriate fences so reordering doesn’t bite you!*). When I started reading the paper, two classic systems papers quickly came to mind: “Lightweight Recoverable Virtual Memory” and “Free Transactions with Rio Vista”. Rio Vista owes a lot to RVM, but one of the memorable things about Rio Vista is that it uses battery-backed RAM to perform atomic and durable transactions on persistent data structures. Like BPRAM, the persistent but directly-modifiable nature of battery-backed DRAM really simplifies many aspects of the system. Naturally, when I got to the end of the paper, I found that they cited Rio Vista in their related work. Anyway, it’s good to see that one of the co-authors is a former Georgia Tech classmate, Derrick Coetzee.

* One of the most interesting parts of the paper is their “epoch” mechanism — while fences work fine for volatile data structures, they don’t provide strong enough semantics when you have persistent memory. Memory fences aren’t strong enough because they just affect a CPU’s view of memory, not the actual contents of DRAM. A CPU’s view of memory is basically DRAM plus “diffs” of more recent data at various caching levels. With BPRAM, if the power goes out, the diffs in cache disappear and then the persistent data left may not be consistent by itself. You have to make stronger guarantees about when persistent data gets written to maintain proper stored data structure consistency.
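The copy-then-publish pattern has a familiar analogue at file granularity, which may make the idea more concrete. The sketch below is only an analogy under stated assumptions: BPFS works on in-memory pointers and its own epoch mechanism, not on rename(2), and the file names here are made up for the demo.

```shell
#!/bin/sh
# Shadow-copy update of a small data file.
# Analogy only: BPFS publishes by updating a pointer in BPRAM, not via rename(2).
dir=$(mktemp -d)
echo "version 1" > "$dir/data"

# 1. Make the modification on a private copy.
sed 's/1/2/' "$dir/data" > "$dir/data.tmp"

# 2. Force the private copy to stable storage before publishing it.
#    This plays the role that ordering/durability guarantees play for BPRAM.
sync

# 3. Publish atomically: a rename within one filesystem replaces the old
#    name in one step, so a reader sees version 1 or version 2, never a mix.
mv -f "$dir/data.tmp" "$dir/data"

cat "$dir/data"
```

If the machine dies before step 3, the old file is still intact; if it dies after, the new one is complete. That is the same property the pointer swap gives a persistent in-memory structure.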

Linux and flash
Although the SOSP paper is recent and on my research radar, it is not what prompted this post. The original impetus came from my recent viewing of the Linuxcon 2009 roundtable discussion. This roundtable gained some Slashdot notoriety because Linus made a comment about the kernel being “huge and bloated,” due to its expanding feature set and icache footprint. During the Q&A session, someone asked a question regarding flash RAM. The question was predicated on the assumption that flash will transition to directly addressable (internal) memory, and asked whether, since flash cells have limited lifetimes, Linux would eventually integrate code to deal with directly accessible memory failing. Ted Ts’o sort of dismissed the question and said that he believed that the right place to address failure properties is in the hardware (and he didn’t necessarily agree that flash would move to directly addressable memory). Currently, flash exposed with a “disk drive” interface (i.e., flash that looks like a fast hard drive) handles failure and wear leveling underneath the storage interface.

This reminded me, however, that there is another class of flash support in Linux that is commonly misunderstood. Linux has an MTD (Memory Technology Devices) subsystem which supports “bare flash” devices. These devices are more common on embedded systems, and basically “bare flash” is exposed flash memory that doesn’t look/act like a standard hard drive. The software above gets to access the real flash blocks and has to handle wear leveling, dealing with bad blocks and also dealing with write/erase semantics (things that the firmware would do in a hard drive-like flash disk). MTD devices don’t act like block devices and actually expose three operations: read, write, and erase. There’s a whole class of Linux filesystems built to run on top of these “bare flash” devices: YAFFS, JFFS2, LogFS, and UBIFS, and they’ve also factored out some of the common functionality used in a lot of these filesystems into a separate UBI (Unsorted Block Images) layer. But I see a lot of misunderstanding of the point of these filesystems. Many casual Linux users or observers think that these filesystems are flash-optimized regular filesystems to be used on top of hard-drive like flash devices (or CF/SD/etc.). They are not: they need a raw MTD device underneath and are not meant for devices that already present a block interface. Anyway, I see this mistake made a lot on forums and such so it came to mind after the Linux roundtable question.
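A quick way to tell whether a machine has any bare flash at all is to look at /proc/mtd, which the MTD subsystem populates with one line per device (size and erase-block size in hex). The helper below is a small sketch; the function name list_mtd is made up, and on a typical desktop it will simply report that there is no MTD table.

```shell
#!/bin/sh
# Print MTD devices with their sizes converted from hex to bytes.
# /proc/mtd format: a header line, then: mtdN: <size> <erasesize> "<name>"
list_mtd() {
  table=${1:-/proc/mtd}
  if [ ! -r "$table" ]; then
    echo "no MTD table at $table"
    return 0
  fi
  # Skip the header, then convert the two hex fields.
  tail -n +2 "$table" | while read -r dev size erase name; do
    echo "${dev%:} size=$((0x$size)) erase_block=$((0x$erase)) name=$name"
  done
}

list_mtd
```

The erase-block size is the interesting number: it is the unit that the erase operation works on, and it is usually far larger than a disk sector.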

No Comments | Tags: Linux, Research Content

28 October 2009 - 23:57 SSH tips and tricks

Today, I gave an “SSH tips and tricks” presentation for our local LUG. Practically every Linux hobbyist knows about basic ssh, scp (and sftp), but OpenSSH has a lot of rich, hidden functionality — like a SOCKS proxy, connection sharing and even layer-2 and -3 VPN functionality. Also, there’s a lot of little useful bits of functionality like ssh-keygen -R, selectable ciphers and ssh -t, which people are often unaware of. So the goal of my presentation was to go over some lesser known (or more advanced) useful features of OpenSSH. I’m posting the presentation here since I post Linux-related stuff on my blog.

This presentation is updated from “SSH Tips and Tricks”, given on Wed. Feb 28th, 2007 by Benjamin McMillan and David Hilley.

New things compared to last time:

  • ssh -o StrictHostKeyChecking=no
  • ssh-keygen -R & HashKnownHosts
  • ssh Restricted Shells (rssh, scponly, etc.)
  • ssh keychain
  • ssh pseudo TTY allocation
  • forwarding bind addresses
  • PAM & ssh
  • rsync & ssh ciphers
  • compression
  • parallel ssh tools
  • fuse sshfs

In this presentation, I will skip the “using ssh” basics.

SSH Config File

The ssh config file isn’t particularly advanced, but many of the later tips benefit from config file customization. Typing your username for every host is a pain. In addition, ssh has a large variety of special options (e.g., compression, agent forwarding, host-specific private keys, etc.) that may differ on a per-host basis. In your ~/.ssh directory, create a file named config to set host aliases and options. Example ~/.ssh/config:


Host feynman
  User foobar
  ForwardAgent yes 

Host hawking 192.168.1.1 router
  Hostname hawking
  User root
  ForwardAgent yes
  Port 222 

Host *.cc.gatech.edu
  User bmcm
  ForwardAgent yes
  Compression yes

Run man ssh_config to view documentation about ssh config files.
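To see which settings ssh would actually apply to a host alias, newer OpenSSH releases (6.8 and later) can print the resolved configuration with -G without connecting anywhere. The sketch below writes a throwaway config file so it does not touch ~/.ssh/config; the temporary file and the grep filter are just for the demo.

```shell
#!/bin/sh
# Write a scratch config and ask ssh which options it resolves for an alias.
# -F selects the config file, -G prints the resolved options and exits.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host feynman
  User foobar
  ForwardAgent yes
EOF

if command -v ssh >/dev/null 2>&1; then
  # Keywords are printed in lowercase, one per line.
  resolved=$(ssh -F "$cfg" -G feynman 2>/dev/null | grep -E '^(user|forwardagent) ')
  echo "$resolved"
fi
```

Running the same command against your real config (ssh -G alias) is a cheap way to debug surprising matches from wildcard Host entries.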

Change ssh cipher for rsync

Ever rsync over ssh to CPU-limited devices (e.g., VIA CPUs, ARM, Atom, contended shared servers, etc.)?


$> rsync -e 'ssh -c blowfish' -P -a -v files remotehost:path


  • arcfour and blowfish are typically “cheaper” than 3DES or AES; best one depends on CPU/architecture
  • “none” is an SSHv1-only option; don’t use it with a password across a public network, because your password will be sent in plain text (keys are okay)
  • Note: If you use connection sharing and already have a connection open, the cipher request will be ignored.
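Cipher availability varies by OpenSSH version (arcfour and blowfish were removed from the client in OpenSSH 7.6), so it can help to ask the local client what it supports before hard-coding a name. This is a sketch: it relies on ssh -Q cipher (OpenSSH 6.3 and later), the function name is made up, and the preference order is an assumption rather than a benchmark result.

```shell
#!/bin/sh
# Print the first cipher from the argument list that the local ssh supports.
# "ssh -Q cipher" only queries the client binary; it does not connect anywhere.
first_supported_cipher() {
  supported=$(ssh -Q cipher 2>/dev/null) || return 1
  for c in "$@"; do
    if printf '%s\n' "$supported" | grep -qx -- "$c"; then
      echo "$c"
      return 0
    fi
  done
  return 1
}

# Preference order is a guess; measure on your own hardware.
if cipher=$(first_supported_cipher arcfour blowfish-cbc aes128-ctr); then
  echo "rsync -e 'ssh -c $cipher' -P -a -v files remotehost:path"
fi
```

The server has to support the chosen cipher as well, so a name that passes this local check can still be refused at connection time.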


SSH Compression

ssh features built-in compression:


$> ssh -C user@hostname


This is particularly helpful for X forwarding over WAN connections or even multi-segment LANs (combine with -Y for trusted X11 forwarding). It can make the difference between unusable and tolerable forwarding.

SSH Pseudo TTY Allocation

ssh -t forces pseudo-tty allocation. By default, when you ssh without a command to execute (just to log in), a pseudo tty is allocated. When you specify a command to execute, ssh does not allocate a pty. Forcing it is necessary if you want to run something that’s not just plain text output, like screen or top.


$> ssh -t user@host1 htop


Why wouldn’t you want a pseudo-tty? Well, consider piping binary data:


$> ssh user@host1 "cat file" | diff same_file_copy -


You don’t want a pseudo tty in that case. It will molest your data by interpreting escapes and such (unless you want to over-complicate things with uuencode).

SSH Host Keys

When you connect to a new host via ssh, the host key is stored in ~/.ssh/known_hosts, establishing the host’s identity.

StrictHostKeyChecking option

When you first connect to an unknown host via ssh, it asks you to confirm:


$> ssh star2.cc.gt.atl.ga.us
The authenticity of host 'star2.cc.gt.atl.ga.us (143.215.129.169)' can't be established.
RSA key fingerprint is fe:82:da:4d:76:f7:fa:b4:40:6f:7d:3e:1b:b3:01:bb.
Are you sure you want to continue connecting (yes/no)?



This explicit confirmation is good for security, but sometimes you want to script commands via ssh and don’t want to be prompted. Personal example: 52 node cluster without shared home directories; do it the naive way, and you’d need ~(N^2) different confirmations. Even if you do it on one host and then copy the known_hosts file to the other nodes, it is annoying to go through all of the prompts. Solution:


$> ssh star2.cc.gt.atl.ga.us -o StrictHostKeyChecking=no
Warning: Permanently added 'star2.cc.gt.atl.ga.us,143.215.129.169' (RSA) to the list of known hosts.



On a cluster, you could do something like this to pre-seed the files:


$> seq 1 52 | xargs -n1 -I@ ssh -o StrictHostKeyChecking=no rohan@.cc.gatech.edu uptime
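An alternative that avoids logging in at all is ssh-keyscan, which fetches host keys and (with -H) writes them in hashed form. The sketch below prints the commands by default and only runs them when RUN=1 is set; the function names are made up. Note that ssh-keyscan trusts whatever key the network hands back, just like StrictHostKeyChecking=no, so it is only appropriate on a network you trust.

```shell
#!/bin/sh
# Generate the node names used in the xargs example: rohan1 ... rohanN.
node_names() {
  seq 1 "$1" | sed 's/.*/rohan&.cc.gatech.edu/'
}

# Dry run by default: print the commands. Set RUN=1 to execute them.
seed_known_hosts() {
  node_names "$1" | while read -r host; do
    if [ "${RUN:-0}" = 1 ]; then
      ssh-keyscan -H "$host" >> "$HOME/.ssh/known_hosts"
    else
      echo "ssh-keyscan -H $host >> ~/.ssh/known_hosts"
    fi
  done
}

seed_known_hosts 3
```

Because the result is a plain file, it can be generated once and copied to every node, which removes the N^2 problem entirely.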


HashKnownHosts option

Default in many cases; hashes hostnames in ~/.ssh/known_hosts. Entries look like the following (top is hashing on, bottom is hashing off):


|1|QxEMrKqPTNuBtHIEYSbztjaOkF8=|Y1387YZhibtug9rr4ZVXenyRXb4= ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAg...

boar3.almaden.ibm.com,9.1.112.221 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAg...

Some people prefer to disable hashing by setting HashKnownHosts no in ~/.ssh/config. Hashing breaks tab-completion based on ~/.ssh/known_hosts (but not via other means, like parsing ~/.ssh/config, so it may not be a big deal).

Want to convert an old, non-hashed ~/.ssh/known_hosts file to hashed? Run ssh-keygen -H

ssh-keygen -R & -F

When a host’s key changes, ssh will make you remove the offending key from ~/.ssh/known_hosts before it will let you connect (unless you force it to ignore). Some people edit this file by hand (why?), but ssh-keygen has specific options for this:

  • ssh-keygen -R host — removes the entry for given host
  • ssh-keygen -F host — finds entries for a host

This works whether or not ~/.ssh/known_hosts is hashed, so don’t waste your time editing things by hand.
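Both options accept -f to point at a different file, which makes it easy to try them safely. The sketch below works on a throwaway known_hosts file; the host name example.test and the generated key are stand-ins for the demo, not real hosts.

```shell
#!/bin/sh
# Exercise ssh-keygen -F and -R against a scratch known_hosts file.
tmp=$(mktemp -d)
if command -v ssh-keygen >/dev/null 2>&1; then
  # Generate a stand-in "host key" and build a one-line known_hosts from it.
  ssh-keygen -q -t rsa -b 2048 -N '' -f "$tmp/hostkey"
  echo "example.test $(cut -d' ' -f1,2 "$tmp/hostkey.pub")" > "$tmp/known_hosts"

  # -F exits 0 when the host is present, non-zero when it is not.
  ssh-keygen -F example.test -f "$tmp/known_hosts" >/dev/null && echo "found"

  # -R removes the entry and keeps the previous file as known_hosts.old.
  ssh-keygen -R example.test -f "$tmp/known_hosts" >/dev/null 2>&1

  ssh-keygen -F example.test -f "$tmp/known_hosts" >/dev/null || echo "gone"
fi
```

The exit status of -F also makes it usable in scripts that need to decide whether a host has been seen before.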

View a host’s key fingerprint?

Use ssh-keygen: ssh-keygen -l -f /etc/ssh/ssh_host_rsa_key.pub (or dsa). Add -v to get an ASCII art depiction:


$> ssh-keygen -v -l -f /etc/ssh/ssh_host_rsa_key.pub
+--[ RSA 2048]----+
|    oo  ..       |
|   . ..o. .      |
|    . o. +       |
|     . .= .      |
|      .oS+       |
|      ..oo..     |
|         o+ o    |
|          E+ .   |
|           =o    |
+-----------------+


SSH Keys

ssh keys allow you to log in to remote machines with a public/private key pair. In your ~/.ssh directory you will use ssh-keygen to generate a key pair, and you will have to distribute the public half of the key pair to remote machines. You keep the private half of the key pair private (as if that isn’t obvious).

Passphrases

A passphrase is a password that unlocks the generated key. It may be blank, but we suggest only making it blank if you have other security measures (one of those is detailed below). A stolen but passphrase-protected private key does not immediately compromise your accounts, since the thief still has to crack the passphrase.

Generating a key


$> ssh-keygen -t rsa
Enter file in which to save the key (/home/user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase): (enter passphrase)
Enter same passphrase again: (enter passphrase)
Your identification has been saved in /home/user/.ssh/id_rsa.
Your public key has been saved in /home/user/.ssh/id_rsa.pub.
The key fingerprint is:
2c:3f:a4:be:46:23:47:19:f7:dc:74:9b:69:24:4a:44 user@mydesktop


Add -b to change the number of bits and -C to add a descriptive comment.

Distributing the key


$> cat .ssh/id_rsa.pub | ssh user@server "mkdir -p ~/.ssh/ && cat - >> ~/.ssh/authorized_keys"


The above command appends your public key to the list of authorized keys on a given host. Now when you log in, instead of asking you for a password, ssh will ask you for the key’s passphrase (or it will just let you in, if your passphrase is blank).

Note: Make sure ~/.ssh/authorized_keys, ~/.ssh/ and ~/ do not have go+w perms or sshd will ignore your public keys. “StrictModes no” in sshd_config will disable this check, but that is not a good idea. Think about it: if ~/.ssh/authorized_keys is writable by another user, they can throw in a newly generated public key and log in as you. If ~/.ssh is writable, they can rename or delete ~/.ssh/authorized_keys and make a new one with their key. If ~/ is writable, they can rename the whole ~/.ssh directory and make a new one to use (and so forth).
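The permission fix itself is a one-liner. The sketch below wraps it in a small function (the name fix_ssh_perms is made up) and demonstrates it on a scratch directory with deliberately loose permissions, so nothing in your real home directory is touched unless you call it without an argument.

```shell
#!/bin/sh
# Remove group/other write permission from the three paths sshd checks.
# With no argument it operates on $HOME; pass a directory to try it elsewhere.
fix_ssh_perms() {
  home=${1:-$HOME}
  chmod go-w "$home" "$home/.ssh" "$home/.ssh/authorized_keys"
}

# Demo on a scratch tree that starts out world-writable.
demo=$(mktemp -d)
mkdir "$demo/.ssh"
touch "$demo/.ssh/authorized_keys"
chmod 777 "$demo" "$demo/.ssh"
chmod 666 "$demo/.ssh/authorized_keys"

fix_ssh_perms "$demo"
ls -ld "$demo" "$demo/.ssh" "$demo/.ssh/authorized_keys"
```

Many people go further and use 700 for ~/.ssh and 600 for authorized_keys; go-w is only the minimum needed to pass the check described above.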

Finer-grained Key Control

You can limit the power of keys to run certain commands, to only connect from certain hosts, and disallow things like port forwarding, X forwarding, etc.

Concrete example:
I use a restricted ssh key to run fetchmail. My mailhome is baobab.cc.gatech.edu, and I’ve set up a special entry in my ~/.ssh/config for email checking:


Host email
Hostname baobab.cc.gatech.edu
User davidhi
IdentityFile ~/.ssh/mail_check
LogLevel QUIET



The mail_check key is a special password-less key. In my ~/.ssh/authorized_keys file, I have the following prepended before the public key:


command="/usr/sbin/imapd",no-X11-forwarding,no-port-forwarding,no-agent-forwarding,no-pty ssh-dss A...


This restricts connections made via the key to only running imapd and disallows X11 forwarding, port forwarding, agent forwarding and pty access. This means that the password-less key is locked down (assuming imapd is not compromised) [0].

Available key options:

  • no-agent-forwarding
  • no-port-forwarding
  • no-pty
  • no-X11-forwarding
  • command="command"
  • environment="NAME=value"
  • from="pattern-list"
  • permitopen="host:port"
  • tunnel="n"

Notes:

  • ‘command’ sets a command to run. Adding no-pty makes the command 8-bit clean for two way data transfer. In command, one can use $SSH_ORIGINAL_COMMAND to refer to the command the user wants to execute. Using the shell expression “${SSH_ORIGINAL_COMMAND:-}” handles the case in which no command is specified gracefully [1].
  • ‘environment’ modifies the environment variables.
  • ‘from’ restricts based on the remote host (see ssh patterns).
  • ‘permitopen’ allows -L forwarding for specified host/port combinations only.
  • ‘tunnel’ forces a specific tun device number on the server for VPN tunnels (see ssh -w).

[0] Not sure of the original source of this since I’ve been using it since at least 2003. This might be it, though: http://mah.everybody.org/docs/mail/fetchmail_check
[1] Hint from: http://www.oreilly.com/catalog/sshtdg/chapter/ch11.html
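As a sketch of how $SSH_ORIGINAL_COMMAND can be used, here is a small allow-list wrapper of the kind you might reference from a command="..." option. Everything about it is hypothetical: the install path mentioned in the comment, the function name, and the two permitted commands are chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper, referenced as command="/usr/local/bin/ssh-allow" in
# authorized_keys. sshd runs it instead of the requested command and exposes
# the client's request in SSH_ORIGINAL_COMMAND.
allow_command() {
  case "${SSH_ORIGINAL_COMMAND:-}" in
    "")      echo "no interactive logins with this key" >&2; return 1 ;;
    date)    date ;;
    "df -h") df -h ;;
    *)       echo "rejected: $SSH_ORIGINAL_COMMAND" >&2; return 1 ;;
  esac
}

# Local demonstration: simulate two client requests.
(SSH_ORIGINAL_COMMAND="date"; allow_command)
(SSH_ORIGINAL_COMMAND="rm -rf /"; allow_command) || echo "blocked"
```

The wrapper runs fixed command strings rather than evaluating the client's text, which avoids shell-injection problems that come with passing $SSH_ORIGINAL_COMMAND to eval or sh -c.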

SSH Agent & Single Sign-on

Wouldn’t it be convenient to enter a key’s passphrase only once, and have it unlock your key for the entire terminal session? ssh-agent is a background service that keeps track of your unlocked keys. It can manage multiple keys, and you can add or remove keys or identities dynamically. When you use ssh to log in to a server with key authentication, ssh will ask your agent if it already has the key unlocked. That way, you don’t have to type your passphrase every time.

Starting the agent

Sometimes your distribution’s X server will start up the agent automatically. If it doesn’t, you can either start it manually each time, or add it to an environment appropriate startup script. For Gnome, you might put this in /etc/X11/gdm/PreSession/Default.


$> eval `ssh-agent -s`


The above command will start up the agent, and set some environment variables which ssh will later use to connect to this agent.

Note: the above command is for Bourne-style shells. Use ssh-agent -c for csh-like variants.

Adding an identity to the agent

Before things will start working, you must add your key(s) to the agent. You can do this automatically when you login to Gnome by adding it to your Gnome session (Startup Programs). There are several X11-based ssh-agent password prompting programs (ssh-askpass, ksshaskpass, etc.).


$> ssh-add


By default, the above command will try to add ~/.ssh/id_rsa and ~/.ssh/id_dsa, which are the most popular key filenames. If you have a different filename, just provide the name of the key as a command-line argument to ssh-add. If your passphrase is not blank, it will prompt you for the passphrase, but this will be the only time you have to enter it (until the agent stops or you manually remove the identity from the agent). Confirm that the agent has the key by running:


$> ssh-add -l


Agent connections can also be forwarded automatically via ssh to achieve single sign-on.
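For login scripts it is handy to know which state the agent is in before deciding what to do. ssh-add -l documents three exit statuses (0 when identities are loaded, 1 when the agent is reachable but empty, 2 when no agent can be contacted), and the sketch below maps them to readable names. The function name and the messages are made up.

```shell
#!/bin/sh
# Map ssh-add's exit status to a readable agent state.
# Anything other than 0 or 1 (including a missing ssh-add) counts as unreachable.
agent_status() {
  ssh-add -l >/dev/null 2>&1
  case $? in
    0) echo "has-keys" ;;
    1) echo "empty" ;;
    *) echo "unreachable" ;;
  esac
}

case $(agent_status) in
  unreachable) echo 'no agent; start one with: eval "$(ssh-agent -s)"' ;;
  empty)       echo "agent running; load a key with: ssh-add" ;;
  has-keys)    echo "agent ready" ;;
esac
```

A check like this keeps a startup script from spawning a fresh agent for every new terminal when one is already running.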

Keychain

Keychain is a front-end for ssh-agent allowing easy, system-wide sharing of ssh-agent (rather than per-login). Add something like this to your profile / bash_profile / preferred login script:


### START-Keychain ###
# Lets you re-use ssh-agent and/or gpg-agent between logins

/usr/bin/keychain $HOME/.ssh/id_dsa
source $HOME/.keychain/$HOSTNAME-sh
### End-Keychain ###


If set up properly, you’ll only need to enter your key passphrase once per boot.

PAM SSH

Single sign-on via PAM. Add pam-ssh to XDM’s pam hooks (or GDM or whatever) and your X login will also unlock your private key via ssh-agent.

Port Forwarding

Tunnel traffic through your ssh connection to access non-public / local networks.

Local Forwarding: -L

Say you want to access services local to a private network (not publicly exposed), but from outside you can only ssh to a host on that network. For example, you want to access an internal website running on webhost. You can’t access it from a web browser at home normally, but ssh local port forwarding will let you tunnel the request through the ssh connection to it:


$> ssh -L 80:webhost:80 sshhost


This will bind the local port 80 (the first 80) to a tunnel, with the other end pointing to port 80 (the 2nd 80) of the webhost. You may then access the website by pointing your browser to http://localhost (because localhost port 80 forwards to webhost port 80). Port forwarding can be used to access any TCP service, not just websites (see below for an email example). In fact, dynamic forwards (see below) are often better for websites.

Notes:

  • Binding a local port below 1024 requires root privileges, so as a regular user pick a port >= 1024. For example: ssh -L 8080:webhost:80 sshhost will bind local port 8080 to tunnel to webhost port 80 through the ssh connection to sshhost. Then you’d just point your browser at http://localhost:8080
  • With many websites, virtual hosts are used and the tunneled request won’t have the right virtual host name (i.e., if you point your browser at http://localhost, it’ll use localhost as the virtual hostname rather than webhost). You can fix that by editing /etc/hosts and creating a temporary mapping, but dynamic forwards (see below) are often more convenient.

Here is an example for accessing pop3 via a tunnel. The command below sets up the forwarding and then leaves the connection open in the background without a shell.


$> ssh -L 9999:mailserver:110 shellserver -N -f


After that, you can set up your mail app to use localhost:9999 instead.
Also, as illustrated in the last example, you may use -N -f:

  • -N tells the client not to execute anything remotely
  • -f tells ssh to background
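The pieces of a background forward can be assembled by a small helper that also catches the privileged-port mistake early. This is only a sketch: the function name forward_cmd is made up, it prints the command instead of running it, and it adds -o ExitOnForwardFailure=yes (a standard ssh_config option) so that ssh exits instead of lingering in the background when the local port cannot be bound.

```shell
#!/bin/sh
# forward_cmd LOCAL_PORT TARGET_HOST TARGET_PORT SSH_HOST
# Prints the ssh command for a background local forward.
forward_cmd() {
  # By default only root may bind local ports below 1024.
  if [ "$1" -lt 1024 ] && [ "$(id -u)" -ne 0 ]; then
    echo "local port $1 needs root; pick one >= 1024" >&2
    return 1
  fi
  echo "ssh -N -f -o ExitOnForwardFailure=yes -L $1:$2:$3 $4"
}

forward_cmd 9999 mailserver 110 shellserver
```

A forward started with -f keeps running after the terminal closes, so it has to be stopped explicitly, for example by killing the background ssh process.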

Remote Forwarding: -R

Local forwarding lets you tunnel requests from a local port through the remote host you’ve ssh’d in to. What if you want to go in the other direction — bind a port on the host you’re ssh’d in to and send it back to the host you ssh’d from? Why would you want to do this? Say you have a desktop at work, and there’s no way for you to ssh directly to your work desktop because it doesn’t have a public IP (or incoming ports are firewalled off). ssh from your work desktop back to your home network and use a remote forward. Here is an example ~/.ssh/config:


Host home-with-tunnel
Hostname homemachine.com
RemoteForward 2222:localhost:22
User joe


Now, when you ssh to ‘home-with-tunnel’, it will set up a remote forward from port 2222 on your home machine to your work desktop’s port 22. At home, ssh to localhost over port 2222 (using the -p option) and the request will be forwarded through the existing ssh connection, granting you access to an otherwise unreachable host. Note that this example also illustrates how to set up forwards using your ~/.ssh/config file. Sometimes setting up a remote forward can get confusing, but the localhost in the RemoteForward entry above refers to the machine you are ssh-ing from.

Note: the above scenario establishing potentially unauthorized outside access to a work machine may be against the Network Usage Policies of your place of employment. Use with caution.

Dynamic Forwarding (SOCKS Proxy): -D

Use -D to create a SOCKS proxy.


$> ssh -D 8080 helsinki.cc.gatech.edu


Now set your web browser to use localhost:8080 as a SOCKS 5 proxy. For Firefox, set network.proxy.socks_remote_dns = true (in about:config) and DNS resolving will occur via the proxy, too. That means you can resolve internal network DNS hostnames. SOCKS proxies are very flexible. Instead of forwarding a specific remote TCP port to a local TCP port, -D runs a flexible proxy server which can be used to tunnel arbitrary requests with the right support. Also check out the very useful tsocks wrapper tool.

Bind Addresses

All flavors of forwarding — local, remote and dynamic — can optionally take “bind addresses” so you can expose forwarded ports externally. Normally forwarded ports are only listening on loopback interfaces, but there may be a legitimate reason to publicly expose a port forward. Example:


$> ssh -L 2222:localhost:22 localhost  # establish a port 2222 forward to my own port 22
$> ssh -p 2222 `resolveip -s localhost`
The authenticity of host '[127.0.0.1]:2222 ([127.0.0.1]:2222)' can't be established.
...
$> ssh -p 2222 `resolveip -s $HOSTNAME`
ssh: connect to host 143.215.128.82 port 2222: Connection refused

$> ssh -L '*:2222:localhost:22' localhost
$> ssh -p 2222 `resolveip -s $HOSTNAME`
The authenticity of host '[143.215.128.82]:2222 ([143.215.128.82]:2222)' can't be established.

The * bind address tells ssh to listen on all interfaces. Remember to protect the * from shell expansion with quotes! You could also put a specific hostname or IP there.

Connection Sharing (ControlMaster)

Use ControlMaster auto to speed up remote filename tab completion, sshfs, or other ssh operations to the same host. New connections to a host will use an already established session (if present), which makes operations a lot faster.

Add to ~/.ssh/config:


Host *
ControlMaster auto
ControlPath ~/.ssh/.sock_%r@%h:%p



Then open a master connection in the background and check for the control socket:


$> ssh -Nf helsinki.cc.gatech.edu
$> ls ~/.ssh -la | grep sock
.sock_davidhi@helsinki.cc.gatech.edu:22



Parallel SSH Tools

Good for managing clusters of identical machines (or just all of your own internal systems).


$> parallel-ssh -l root -h hosts.txt -i -- uptime
[1] 18:45:25 [SUCCESS] capricorn.cc.gt.atl.ga.us 22
 18:45:25 up 1 day, 4:20, 0 users, load average: 0.00, 0.00, 0.00
[2] 18:45:25 [SUCCESS] leo.cc.gt.atl.ga.us 22
 18:45:25 up 1 day, 4:17, 0 users, load average: 0.00, 0.00, 0.00
[3] 18:45:25 [SUCCESS] scorpio.cc.gt.atl.ga.us 22
 18:45:25 up 1 day, 4:13, 0 users, load average: 0.00, 0.00, 0.00
[4] 18:45:25 [SUCCESS] libra.cc.gt.atl.ga.us 22
 18:45:25 up 1 day, 4:12, 0 users, load average: 0.00, 0.00, 0.00


Restricted SSH Shells

Sometimes you want users to be able to ssh to a machine but not perform arbitrary shell functions. A system administrator could rely on restricted keys to do certain things, but restricted shells offer more options.

ChrootDirectory option

Relatively recent addition to OpenSSH:


This commit adds a chroot(2) facility to sshd, controlled by a new sshd_config(5) option "ChrootDirectory". This can be used to "jail" users into a limited view of the filesystem, such as their home directory, rather than letting them see the full filesystem.

rssh

Restricted shell for sftp, scp, rsync and cvs. Doesn’t support unison or svn.

scponly

Restricted shell for sftp, scp, rsync, unison and svn. Doesn’t support cvs.

SSH Escape Sequences

Escape sequences are only recognized after a newline and are initiated with a tilde (~) unless you modify it with the -e flag. Hit ~? on a running ssh session to see a list of escapes:


Supported escape sequences:
~. - terminate connection
~B - send a BREAK to the remote system
~C - open a command line
~R - Request rekey (SSH protocol 2 only)
~^Z - suspend ssh
~# - list forwarded connections
~& - background ssh (when waiting for connections to terminate)
~? - this message
~~ - send the escape character by typing it twice
(Note that escapes are only recognized immediately after newline.) 

~. and ~# are particularly useful.

VPN Tunneling

Did you know that ssh can do layer 2 and 3 VPN tunneling? Check out ssh -w. Example from manpage:


$> ssh -f -w 0:1 192.168.1.15 true
$> ifconfig tun0 10.0.50.1 10.0.99.1 netmask 255.255.255.252



FUSE SSHfs

Expose a remote host’s files like an NFS mount but via ssh:


$> sshfs davidhi@killerbee2.cc.gatech.edu:/net/hu17/davidhi ~/cc_home -oCiphers=arcfour



That’s all folks. Thanks!

2 Comments | Tags: Linux

30 September 2009 - 15:21 CentOS / RHEL repo madness

Our school’s local LUG had an InstallFest this past weekend. I encountered and solved a problem with RHEL/CentOS and extra repositories with conflicting dependencies for media players like vlc and mplayer which seems to be rather common, so I decided to document it here.

I’m a Debian user, so I’m used to having a very large set of packages in the base repositories. On a typical Debian desktop system of mine, I may have the base repos with main, contrib and non-free plus debian-multimedia.org (and backports.org and volatile.debian.org on servers running stable). On RHEL, the set of base packages is much smaller; CentOS (with the CentOSPlus repo) has a little bit more but basically the same issues. So people tend to add a bunch of other repositories like the EPEL, rpmforge, RPM Fusion, or individual repos included in those umbrellas like Dries, DAG, Livna, etc. With the proliferation of repositories comes duplicated effort and problems with mutual compatibility.

For example, consider loading a RHEL 5.4 system from scratch and adding the EPEL and rpmforge repos:

  • Install EPEL: rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm
  • Install rpmforge: rpm -Uvh http://packages.sw.be/rpmforge-release/rpmforge-release-0.3.6-1.el5.rf.`uname -m`.rpm [0]

On CentOS I would now add yum-priorities (yum install yum-priorities) and set relative priorities for the different repos. But either way, let’s say you now try to install vlc with yum install vlc and it explodes with the following:

vlc-0.9.9a-3.el5.rf.x86_64 from rpmforge has depsolving problems
  --> Missing Dependency: libcucul.so.0()(64bit) is needed by package vlc-0.9.9a-3.el5.rf.x86_64 (rpmforge)
vlc-0.9.9a-3.el5.rf.x86_64 from rpmforge has depsolving problems
  --> Missing Dependency: libdvdread.so.3()(64bit) is needed by package vlc-0.9.9a-3.el5.rf.x86_64 (rpmforge)
Error: Missing Dependency: libdvdread.so.3()(64bit) is needed by package vlc-0.9.9a-3.el5.rf.x86_64 (rpmforge)
Error: Missing Dependency: libcucul.so.0()(64bit) is needed by package vlc-0.9.9a-3.el5.rf.x86_64 (rpmforge)

Why is this happening? Well, for the libcucul.so.0 dependency, the EPEL packages libcaca, and so does rpmforge, but they are not entirely substitutable (there’s a similar problem with libdvdread). Usually I trust the EPEL more (and give it a better priority) since it is more “official” and Red Hat-endorsed (and has high quality packaging standards), but in this case we need to satisfy all of the dependencies from rpmforge. So, I use the following command:
yum --disablerepo='epel' install vlc

For the system at the InstallFest, it was even more complex because it had the EPEL, rpmforge and RPM Fusion and some packages from each. I had to disable all but rpmforge and install libcaca and caca-utils and then disable RPM Fusion to properly install vlc (and first I had to remove the libcaca package that was pulled in from the EPEL). yum was developed in part to resolve “rpm dependency hell”, but these warring packaging factions are causing the same problems to be exposed again to the user through yum, which is frustrating.

[0] Right now packages.sw.be seems to be broken in some places so you can grab from say http://mirror.cpsc.ucalgary.ca/mirror/dag/redhat/el5/en/`uname -m`/rpmforge/RPMS/rpmforge-release-0.3.6-1.el5.rf.`uname -m`.rpm

8 Comments | Tags: Linux

28 September 2009 - 14:39 Supermicro X8DT3’s onboard LSI controller in Linux

So we got a new storage box in our lab with a Supermicro X8DT3 motherboard and a bunch of 15k SAS drives. The motherboard comes with an integrated LSI 1068E SAS controller and lspci identifies it as an “LSI Logic / Symbios Logic MegaRAID SAS 8208ELP/8208ELP.” The primary user of the system had a bit of trouble getting the damn thing to work in Linux, though, so I investigated the issue.

So apparently there are a few different drivers that could support the card, and they all have their quirks. Some people have had success with the open-source and in-kernel megaraid driver, and LSI provides some 1068E Linux drivers on their site. Their drivers are the semi-closed mptsas drivers, but they do provide the source and dkms support, so they can be recompiled against different kernels (provided they aren’t changed to the point where stuff breaks).

Our problem was that the mptsas driver was compiling and loading fine but we weren’t seeing any disks. It turns out the integrated X8DT3 has two modes controlled by a hardware jumper: SR mode (Software RAID Mode) and IT (Integrated Target Mode). SR mode is the default, and that is where the card exports logical arrays to the host OS, but the mptsas driver doesn’t seem to work in that mode. In IT mode, the card just exports individual drives, and the mptsas driver works fine. However, the primary user didn’t want individual drives, he wanted the logically configured arrays (yes, I’m aware that Linux md/software RAID could have probably done just as good a job as the LSI controller’s SR mode, but I wanted to see if I could get it to work in SR mode).

Anyway, a blog post that informed me about the IT/SR mode distinction had a pointer to Supermicro’s FTP site, and there I found some promising “SR” drivers. I had seen references to a closed-source megasr driver in some forum posts, but I couldn’t find it from LSI (at least not a RHEL5u3 version). Well, megasr is apparently what is on Supermicro’s FTP site. I navigated to ftp://ftp.supermicro.com/driver/SAS/LSI/1064_1068/SR/Driver/Linux/ and grabbed a disk image for the version of RHEL 5 we’re using (luckily we hadn’t updated to the newly released 5.4, because there doesn’t seem to be an image for it). The file command showed that the .img file was a floppy image, so I mounted it via loopback and extracted the compiled megasr.ko using the following process:

  • mount -o loop megasr-13.10.0708.2009-1-rhel50-u3-all.img /mnt/
  • zcat /mnt/modules.cgz | cpio -idv

The last command will extract the modules to the current directory (making subdirectories for various kernel versions and architectures). The .img file is compiled for use with the RHEL installer, hence the floppy image format and the modules packed as a gzipped cpio archive. We had a working install on a standalone SATA disk and the SAS drives off of the controller were just going to be used for data storage, so we just needed to add the kernel module after the system was already installed.

So I removed the LSI-provided modules (with rpm -e to remove the mptlinux dkms package), copied the appropriate megasr.ko to /lib/modules/`uname -r`/extra, ran depmod, rebooted and everything was finally working in SR mode. Luckily we were using RHEL; it looks like people using distros other than RHEL or SLES are out of luck, because there doesn’t seem to be any source available to recompile the module against vanilla (or other) Linux kernels.
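
The copy-and-depmod step can be sketched roughly as below. This is only a sketch under assumptions: the function names find_megasr and install_megasr are made up for illustration, and the directory layout inside the extracted archive (kernel version, then architecture, then megasr.ko) is a guess based on the description above, so check what cpio actually unpacked before relying on it.

```shell
# Print the path of the megasr.ko matching a kernel version and architecture
# inside the tree extracted from modules.cgz.
# $1 = extraction directory, $2 = kernel version, $3 = architecture
# Assumption: the archive unpacks to <kernel version>/<arch>/megasr.ko.
find_megasr() {
    find "$1" -type f -name megasr.ko \
        -path "*/${2:-$(uname -r)}/*" -path "*/${3:-$(uname -m)}/*" | head -n 1
}

# Copy the matching module into place and rebuild the module dependency
# list (needs root). $1 = extraction directory
install_megasr() {
    ko=$(find_megasr "$1")
    [ -n "$ko" ] || { echo "no megasr.ko found for $(uname -r)" >&2; return 1; }
    dest="/lib/modules/$(uname -r)/extra"
    mkdir -p "$dest" && cp "$ko" "$dest/" && depmod -a
}
```

Run from the directory where the cpio archive was extracted, this would be install_megasr . followed by a reboot. Failing loudly when no module matches the running kernel seems safer than copying a module built for a different kernel.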

Tags: Linux

1 August 2009 - 13:42 Mount LVM-based volumes from loopback full disk images

Recently I needed to extract some files from the root partition in a full disk backup image taken with dd. I didn’t notice when I took the disk image, but the disk only contained two primary partitions: /boot and an LVM physical volume containing the rest of the partitions as LVM logical volumes. I don’t work with LVM much manually, so I had to look up the commands to get it to find physical volumes and activate volume groups. Here’s the full process of mounting LVM logical volumes from a full disk image:

  • There are two ways to get to the LVM partition on this disk, and I’ll cover both: 1) the easy way and 2) the manual offset-finding way.
  • First, the easy way: make sure the loop module is loaded with the max_part parameter, which causes the automatic creation of loopback subdevices for individual partitions. An easy way to make sure is to remove it and re-insert it with the right parameter: modprobe -r loop && modprobe loop max_part=63
  • Next, attach the whole disk image to a loopback device: losetup /dev/loop0 sda.img. Now you should see /dev/loop0p1, /dev/loop0p2, etc. for all of the individual partitions. You’re already done with this part and can skip to the next section to deal with LVM directly.
  • If you can’t use the easy method, attach the whole disk image to a loopback device to look at the partition offsets: losetup /dev/loop0 sda.img
  • Now /dev/loop0 looks just like the original block device, so check out the partition table sector offsets with fdisk: fdisk -u -l /dev/loop0. I use sector offsets (-u flag) rather than cylinders because they are easier to work with and some partitions may not fall on cylinder boundaries. My image shows something like this:
  • Disk /dev/loop0: 250 GB, 250056737280 bytes
    255 heads, 63 sectors/track, 30401 cylinders, total 488392065 sectors
    Units = sectors of 1 * 512 = 512 bytes
    
         Device Boot      Start         End      Blocks   Id  System
    /dev/loop0p1   *          63      401624      200781   83  Linux
    /dev/loop0p2          401625   488392064   243987187   8e  Linux LVM
    
  • Now, detach the whole disk image from /dev/loop0 with losetup -d /dev/loop0, and then attach just the LVM partition to /dev/loop0 by passing the partition’s byte offset. Here, the second partition starts at sector 401625, and each sector is 512 bytes, so the offset is 401625 × 512 = 205632000. Run losetup -o 205632000 /dev/loop0 sda.img
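
If you’d rather not do the offset arithmetic by hand, something like the following sketch could work. The function names lvm_start_sector and sector_to_bytes are made up for illustration, and the parsing assumes fdisk prints the same column layout as the listing above (other fdisk versions may format it differently).

```shell
# Print the start sector of the first partition whose Id is 8e (Linux LVM),
# reading `fdisk -u -l` output on stdin. A boot flag (*) shifts every
# column after the device name by one, so account for it.
lvm_start_sector() {
    awk '$1 ~ /^\/dev\// {
        if ($2 == "*") { start = $3; id = $6 } else { start = $2; id = $5 }
        if (id == "8e") { print start; exit }
    }'
}

# Convert a start sector into a byte offset suitable for `losetup -o`.
# $1 = start sector, $2 = sector size in bytes (defaults to 512)
sector_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}
```

With the whole image still attached, fdisk -u -l /dev/loop0 | lvm_start_sector prints the start sector, and passing that number to sector_to_bytes gives the value for losetup -o. For the image above that is sector 401625 and offset 205632000.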

Now whichever method you used, you have a loopback device with the LVM physical volume partition: /dev/loop0p? (2 in my case) if you used the easy way, or /dev/loop0 if you used the manual offset method. Get LVM to recognize the physical volume and activate the volume groups:

  • Tell LVM to scan for new physical volumes: lvm pvscan
  • Activate the volume groups: lvm vgchange -ay (it will print something like 2 logical volume(s) in volume group "VolGroup00" now active).
  • Now you can finally mount the LVM logical volumes. Run lvm lvs to list the logical volumes. Each should appear in /dev/mapper, typically with the device name (volume group name)-(logical volume name), like VolGroup00-LogVol00.
  • When you are done, unmount all logical volumes and deactivate the volume groups with lvm vgchange -an. Now you can reclaim the loopback device by using losetup -d /dev/loop0.
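
The LVM steps above can be strung together roughly like this. It is a sketch, not a tested recovery tool: the function names dm_path, mount_lv and unmount_lv are made up for illustration, and mounting read-only is my assumption about what you’d want for a backup image. One detail worth knowing is that device-mapper doubles any hyphen that appears inside a volume group or logical volume name, so the /dev/mapper name is not always a plain concatenation.

```shell
# Build the /dev/mapper path for a logical volume. Device-mapper doubles
# any hyphen inside the VG or LV name so that the single hyphen joining
# the two names stays unambiguous.
# $1 = volume group, $2 = logical volume
dm_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

# Scan for physical volumes, activate the volume group and mount one
# logical volume read-only (needs root).
# $1 = volume group, $2 = logical volume, $3 = mount point
mount_lv() {
    lvm pvscan &&
    lvm vgchange -ay "$1" &&
    mount -o ro "$(dm_path "$1" "$2")" "$3"
}

# Undo mount_lv: unmount, then deactivate the volume group.
# $1 = volume group, $2 = mount point
unmount_lv() {
    umount "$2" &&
    lvm vgchange -an "$1"
}
```

For the image above that would be mount_lv VolGroup00 LogVol00 /mnt, then unmount_lv VolGroup00 /mnt and losetup -d /dev/loop0 when you are done.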

Tags: Linux