Wednesday May 20, 2009

Bizarre ftp Behaviour

After spending a couple of hours editing a webpage, I lost all that effort when copying the file to my webserver. And of course it happened before the automatic backup had been made. :-(

As always, there is a lesson to be learned.

Try this yourself:

$ ftp webserver

ftp> prompt
Interactive mode off.

ftp> mput *
local: img_2.gif remote: img_2.gif
227 Entering Passive Mode (192,168,32,72,126,33)
150 Opening BINARY mode data connection for img_2.gif.
226 Transfer complete.
12980 bytes sent in 4.2e-05 seconds (3e+05 Kbytes/s)
local: img_3.jpg remote: img_3.jpg
227 Entering Passive Mode (192,168,32,72,126,33)
150 Opening BINARY mode data connection for img_3.jpg.
226 Transfer complete.
28488 bytes sent in 0.017 seconds (1.6e+03 Kbytes/s)
local: webpage.html remote: webpage.html
227 Entering Passive Mode (192,168,32,72,164,239)
150 Opening BINARY mode data connection for webpage.html.
226 Transfer complete.
12498 bytes sent in 4.5e-05 seconds (2.7e+05 Kbytes/s)

ftp> ls *
227 Entering Passive Mode (192,168,32,72,219,226)
150 Opening ASCII mode data connection for /bin/ls.
total 268
-rw-r--r--    1 wwwillem wwwillem    12980 May 20 09:08 img_2.gif
-rw-r--r--    1 wwwillem wwwillem    28488 May 20 09:08 img_3.jpg
-rw-r--r--    1 wwwillem wwwillem    12498 May 20 09:08 webpage.html
226 Transfer complete.

ftp> ls -l
227 Entering Passive Mode (192,168,32,72,207,63)
150 Opening ASCII mode data connection for /bin/ls.
total 268
-rw-r--r--    1 wwwillem wwwillem    12980 May 20 09:08 img_2.gif
-rw-r--r--    1 wwwillem wwwillem    28488 May 20 09:08 img_3.jpg
-rw-r--r--    1 wwwillem wwwillem    12498 May 20 09:08 webpage.html
226 Transfer complete.

So far all has gone well. Notice the file-size of webpage.html, 12 kB.

Now comes the problem:

ftp> ls -l *html
227 Entering Passive Mode (192,168,32,72,142,146)
150 Opening ASCII mode data connection for /bin/ls.
226 Transfer complete.

ftp> quit
221-Thank you for using the FTP service on webserver.
221 Goodbye.

$ ls -l webpage.html
total 62
-rw-r--r-- 1 willem willem   209 May 20 09:07 webpage.html

Notice how the file has shrunk from 12 kB to only a couple of hundred bytes? Let's have a look at the (new) content.

$ cat webpage.html
total 123
-rw-r--r-- 1 willem willem 12980 May 20 09:01 img_2.gif
-rw-r--r-- 1 willem willem 28488 May 20 09:01 img_3.jpg
-rw-r--r-- 1 willem willem   209 May 20 09:07 webpage.html

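What happened here? The 'ls *' ftp command was fine, and so was 'ls -l'. However, when we did 'ls -l *html', the output of the 'ls -l' command was written to the local html file, in this case webpage.html. Most likely the ftp client treats a second argument to 'ls' as the name of a local file that should receive the listing, and because prompting was switched off it never asked for confirmation. That's definitely not what I would have expected to happen. Very weird behaviour!!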
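One cheap safeguard against this kind of overwrite is to snapshot the local files before opening the ftp session. The following is only a minimal sketch: the function name and the BACKUP_ROOT variable are my own inventions, not part of any ftp client.

```shell
#!/bin/sh
# Copy each named file into a timestamped backup directory before uploading.
# BACKUP_ROOT is a hypothetical setting; it defaults to ~/.ftp-backup.
backup_before_upload() {
    dest="${BACKUP_ROOT:-$HOME/.ftp-backup}/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        # -p keeps the timestamps and permissions of the original.
        cp -p "$f" "$dest/" || return 1
    done
    # Print the backup location so the caller can find the copies.
    echo "$dest"
}
```

Running something like "backup_before_upload *.html *.gif *.jpg" before "ftp webserver" would have left me with an intact copy of webpage.html.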

Finally, the client was ftp on CentOS 5 and the server was an ancient Red Hat 6.2 server. And no, 6.2 is not the successor of EL5. :-)

Monday Jul 14, 2008


It happens too often that at customer sites there are issues around the IP address for the Service Processor. The proper way to handle this is IMHO simple: For each server, add an entry to the DHCP server, where based on the MAC address of the SP, a known IP address will be assigned. This way, everything is controlled by a centralized DHCP configuration but still each server gets a "semi static" IP address.
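To make the "semi static" idea concrete, here is a small sketch that prints one such entry. It assumes the ISC DHCP server and its dhcpd.conf "host" syntax, which is my assumption and may not match the DHCP server at your site. The label, MAC address and IP address in the usage example are made-up values.

```shell
#!/bin/sh
# Print an ISC dhcpd "host" stanza that pins an IP address to a MAC address.
# Arguments: a label for the entry, the MAC address of the SP, the IP address.
dhcp_host_entry() {
    printf 'host %s {\n' "$1"
    printf '  hardware ethernet %s;\n' "$2"
    printf '  fixed-address %s;\n' "$3"
    printf '}\n'
}
```

For example, "dhcp_host_entry server1-sp 00:11:22:33:44:55 192.168.1.50" prints a stanza that could be appended to dhcpd.conf, one per server.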

Unfortunately, in many situations customers cannot implement this, or they simply don't want to. The "cannot" is most likely not based on technical arguments, but has mainly to do with organizational and responsibility issues.

Your second option is to set the Service Processor IP address from the BIOS. That works fine, but sometimes it can be hard to find a monitor and keyboard in a Data Center, or nobody is willing to give you a free static IP address. The third option is to let the SP do a DHCP request and monitor the logfiles on the DHCP server to see what address was handed out. That won't work if the person who needs to use the SP has no access to the DHCP server.

The end result is that you can easily end up in a Catch-22. In the old days of V20/40z servers, we had those tiny LCD screens and you could even set the IP address using a few buttons on the front of the server. But the newer generation doesn't have those features anymore.

Last week, I was again confronted with this problem (on a corporate network, where I had zero privileges) and I solved it in a different way. What I did was write a little script that tries to ping every IP address in the subnet, or better in the range that is available to the DHCP server. Kind of "poor man's port scanner". The script (let's call it "pingscan.sh") is pretty primitive and looks like this.

# Ping every address from 192.168.1.130 to 192.168.1.249, one try each.
for i in 13 14 15 16 17 18 19 20 21 22 23 24; do
  for j in 0 1 2 3 4 5 6 7 8 9; do
    ping -n -t 1 192.168.1.$i$j 1
  done
done

This will scan addresses from 192.168.1.130 to 192.168.1.249; adjust the values for your network. The "-t 1" and the "1" at the end (this is Solaris ping) make sure that each address in the range is tried at one-second intervals. For Linux use "-c 1" and no trailing "1". The script will take a few minutes to complete, depending on the range.
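For Linux, the same scan could look roughly like the sketch below. It assumes the iputils ping, where "-c 1" sends a single packet and "-W 1" waits at most one second for the reply; the function names are hypothetical. The output lines deliberately mimic the Solaris wording ("is alive" / "no answer from") so the rest of the procedure stays the same.

```shell
#!/bin/sh
# Print one address per line: prefix, first and last value of the final octet.
addr_range() {
    prefix=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        echo "$prefix.$i"
        i=$((i + 1))
    done
}

# Ping each address once and report the result in Solaris-like wording.
# Assumes Linux iputils ping (-c count, -W timeout in seconds).
pingscan() {
    for addr in $(addr_range "$1" "$2" "$3"); do
        if ping -c 1 -W 1 "$addr" >/dev/null 2>&1; then
            echo "$addr is alive"
        else
            echo "no answer from $addr"
        fi
    done
}
```

Calling "pingscan 192.168.1 130 249" covers the same range as the loop above without having to spell out every digit.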

Fire up your server and let the SP do a DHCP request. Then run "pingscan.sh > before". And here comes the trick! Disconnect the network cable to the SP and run "pingscan.sh > after". A simple diff of the two files will show which IP address was given to the SP.

# diff before after
< is alive
> no answer from

The script could be made much fancier, but this one only uses /bin/sh and can be typed in a couple of minutes. On a large and busy network it could happen that you will get multiple candidates. And of course this is not a preferred solution, because it isn't guaranteed that a week later the DHCP server won't give a different IP address. But this trick can help when you find yourself in a nasty corner. At least, it did that for me.
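When the diff gets noisy, the comparison itself can be scripted. This sketch assumes the two files contain Solaris-style lines of the form "ADDRESS is alive" and "no answer from ADDRESS"; the function name is made up, and it needs a shell that understands POSIX syntax.

```shell
#!/bin/sh
# Print the addresses that answered in the first scan but not in the second.
# Arguments: the "before" file and the "after" file.
sp_candidates() {
    before=$1
    after=$2
    # Addresses that were alive before the cable was pulled.
    awk '$2 == "is" && $3 == "alive" { print $1 }' "$before" | sort > /tmp/alive.$$
    # Addresses that no longer answer afterwards.
    awk '$1 == "no" && $2 == "answer" { print $4 }' "$after" | sort > /tmp/dead.$$
    # Lines present in both sorted lists are the candidates.
    comm -12 /tmp/alive.$$ /tmp/dead.$$
    rm -f /tmp/alive.$$ /tmp/dead.$$
}
```

"sp_candidates before after" then prints one address per line. More than one line means the network was busy enough that another host dropped off during the scan, and it's worth repeating the exercise.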

Friday May 11, 2007

Network Speed with Zones

A little time back I was preparing for a big benchmark project where our customer wanted to compare a single large system using many zones with a more horizontally scaled infrastructure, consisting of a number of smaller servers, like V490 and V890. I immediately thought that replacing a number of servers, being chatty over the network, with a single server, carved up into zones, would give a big benefit in network performance. Zone-to-zone network traffic should be faster than server-to-server. So I fired off some emails to people that I thought would give me the final answer, but the responses were very mixed.

Therefore it was time to do some of my own experiments. Doing a big benchmark in one of the Sun Solution Centers, I had access to some serious hardware for these tests. On the other hand, as is usual with these types of projects, there was a lot going on at the same time, so in the end time was limited for this little exercise.

This was my test platform:

  • A few 8 CPU / 16 core domains (1800 MHz US-IV+) on E25K.
  • A couple of quad Gigabit Ethernet cards, connected to an SMC switch.
  • We had to use the 'ce' network drivers, because there were incompatibilities with other ones.

This is the environment I built:

  • domain A, zone 1, IP, interface ce2:1 - used for sending files
  • domain A, zone 2, IP, interface ce2:2 - receiver, using the same physical interface, but with its own virtual interface
  • domain A, zone 3, IP, interface ce3:1 - receiver, having its own network interface, but different from the one used by the sender
  • domain B, zone 1, IP, interface ce4:1 - receiver, now a completely different domain, so will only communicate with sender over the copper wire

This provided us with three test scenarios: a) network traffic from one virtual interface to another, both on the same physical interface, b) two zones talking with each other, each with their own physical interface and c) two independent servers, or in this case domains.

I used ftp to send files of three different sizes: 1M, 3M and 1G bytes. All files were created in /tmp and sent to /tmp. I repeated each test three times. Here are the results (all times in secs):

same interface
other interface
other interface
1 MB 0.0072
3 MB 0.27
1 GB 6.7

So, from this we can see clearly that zone-to-zone traffic doesn't "hit the copper" and probably gets short-circuited somewhere in the IP layer of the TCP/IP stack. I would think that with slower interfaces, like 100 Mbps, the speed advantage will be even higher than the 1.5-2x we see here.
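A quick way to sanity-check the "doesn't hit the copper" conclusion is to convert a transfer time into throughput and compare it with the 1000 Mbps line rate of Gigabit Ethernet. The helper below is just a sketch with a made-up name; it assumes that the 1 GB file was 2^30 bytes and it ignores protocol overhead.

```shell
#!/bin/sh
# Print the throughput in megabits per second, rounded to a whole number.
# Arguments: number of bytes transferred, elapsed time in seconds.
throughput_mbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b * 8 / s / 1000000 }'
}
```

For the 6.7 second figure measured with the 1 GB file, "throughput_mbps 1073741824 6.7" prints 1282. That is more than a Gigabit link can carry, which is consistent with that traffic never having gone over the wire.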

Monday Jul 24, 2006

Network Card for Solaris X86

It's the kind of thing you don't have to do very often, because the Operating System install takes care of it so well. Even to the extent that you are tempted to just reinstall the OS when adding some new hardware to your system. In this case I needed to add two 3Com network cards to an Ultra-20 that was already configured for the onboard Ethernet. I know how to do it under Linux: just start the GUI config tool. With Solaris, it's a bit more of a manual process. But in the end it's not too tough, and when you get stuck, Google is your friend.

I first checked the Solaris FAQ at www.sun.drydog.com. It was not 100% accurate (probably based on an older Solaris version), but a very good starting point. Manually configuring a network with ifconfig is something I've done often enough. But the issue for me is that I don't know which device/driver name to use. In Linux this is simple, it's always "eth0", but in Solaris it depends on the driver.

After adding the network cards and rebooting I did a PCI scan:

bash-3.00# /usr/X11/bin/scanpci
pci bus 0x0000 cardnum 0x0a function 0x00: vendor 0x10de device 0x0057
 nVidia Corporation CK804 Ethernet Controller
pci bus 0x0001 cardnum 0x09 function 0x00: vendor 0x10b7 device 0x9050
 3Com Corporation 3c905 100BaseTX [Boomerang]
pci bus 0x0001 cardnum 0x0a function 0x00: vendor 0x10b7 device 0x9050
 3Com Corporation 3c905 100BaseTX [Boomerang]

You see the onboard Ethernet Controller and then the two 3Com cards. The important part is the vendor and device numbers. With these two, we now have a look at:

bash-3.00# grep 9050 /etc/driver_aliases
elxl "pci10b7,9050"

This gives us the "elxl" driver name I was looking for. Alternatively, you can have a look at:

bash-3.00# grep 9050 /boot/solaris/devicedb/master
pci10b7,9050 pci10b7,9050 net pci elxl.bef "3Com 3C905-TX Fast Etherlink XL 10/100"
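The lookup from scanpci numbers to a driver name can be wrapped in two small helpers. This is only a sketch with hypothetical function names. It assumes that aliases are written as "pci" followed by the vendor and device numbers in hex without the "0x" prefix and without leading zeros; check that against your own /etc/driver_aliases.

```shell
#!/bin/sh
# Remove a leading "0x" and any leading zeros: 0x0057 becomes 57.
strip_hex() {
    echo "$1" | sed -e 's/^0[xX]//' -e 's/^0*//'
}

# Build the alias string from the vendor and device numbers shown by scanpci.
pci_alias() {
    echo "pci$(strip_hex "$1"),$(strip_hex "$2")"
}

# Print the driver name bound to an alias in a driver_aliases-style file.
# The second argument is optional and defaults to /etc/driver_aliases.
driver_for() {
    awk -v a="\"$1\"" '$2 == a { print $1 }' "${2:-/etc/driver_aliases}"
}
```

On the system above, driver_for "$(pci_alias 0x10b7 0x9050)" should print the same "elxl" that the grep found.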

To make sure that Solaris "picks up" the card, you need to do a "touch /reconfigure" and then restart your system with "reboot" or "init 6". The FAQ says that you then have to press 'Esc' during the driver configuration, but that's not the case (anymore). After rebooting, it's time to configure the network interface. First by hand:

bash-3.00# ifconfig elxl0 plumb
bash-3.00# ifconfig elxl0 netmask
bash-3.00# ifconfig elxl0 up
bash-3.00# ifconfig elxl0
bash-3.00# ping
bash-3.00# ifconfig elxl0 down
bash-3.00# ifconfig elxl0 unplumb

And when that works fine, (assuming "moon" is the hostname) make it permanent with:

bash-3.00# echo "moon" > /etc/hostname.elxl0
bash-3.00# echo " moon" >> /etc/hosts
bash-3.00# svcadm restart network/physical
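
If you do this more than once, the two echo commands can be wrapped so that a second run doesn't add a duplicate line to /etc/hosts. A minimal sketch follows: the function name and the optional root directory argument are my own additions (the latter is handy for a dry run outside /), and the IP address in the usage example is a placeholder.

```shell
#!/bin/sh
# Write etc/hostname.<interface> and add a hosts entry if it is missing.
# Arguments: hostname, IP address, interface name, optional root directory.
persist_interface() {
    name=$1
    ip=$2
    ifname=$3
    root=${4:-}
    echo "$name" > "$root/etc/hostname.$ifname" || return 1
    # Append to hosts only when no line already mentions this hostname.
    if ! grep -w "$name" "$root/etc/hosts" >/dev/null 2>&1; then
        printf '%s\t%s\n' "$ip" "$name" >> "$root/etc/hosts"
    fi
}
```

For example, "persist_interface moon 192.0.2.10 elxl0" followed by the same "svcadm restart network/physical" as above.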