Steve: Developing on the Edge
Thoughts on development, Web-services, technology and mountains.
Tue 9 Dec 2008
The mediocre firewall of the UK meets spinal tap

I am fortunate in that I can now browse to a Wikipedia entry on a 1970s rock band album. The kind of album and album cover that Spinal Tap made fun of. And what a gift it has proven to be, one that just keeps giving. Now so many more people round the UK have heard of the band, seen the cover and are aware that all the consumer ISPs are filtering all your HTTP requests through a transparent proxy.

Such is progress. The China firewall works. They've got all suppliers to the state, from the apps -Skype- to the network switches, to conspire to block forbidden keywords and sites. Here our ISPs can't even censor 1970s rock band artwork reliably.

Tue 9 Dec 2008
limitations of a file-driven CM world view

I am upgrading my desktop from Ubuntu 7.10 to 8.10 by way of 8.04. It was not intentional, but I made the mistake of upgrading to SmartSVN 5 last week, which silently updated my SVN databases to a version that is no longer compatible with the 7.10 command-line tools. A bulk upgrade seems like the best way forward. So, I have to use the laptop while the desktop churns away. With the Janet network two router hops away the download is fast (1 MB/s), so within 10 minutes of starting the update process I am presented with the first "a configuration file you never knew existed and whose syntax is alien to you has changed, would you like to see it" dialog.

This shows a flaw in .deb and .rpm systems. Their world view is that everything is a file, and that files belong to specific packages. But really, it is file state changes that belong to packages. If I install samba, and it creates a user samba in /etc/passwd, it is deploying a state change to that file, "insert a new line"; only that line should be managed as part of the package. Furthermore, changes to the system that don't invalidate that change, "there is a samba user in /etc/passwd", are valid, with no need to bother me.
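To make the "state change, not file" idea concrete, here is a minimal Python sketch. It assumes a package declares the one line it needs instead of owning the whole file. The function names and the sample passwd entries are hypothetical, and this is not how dpkg or rpm actually behave.

```python
def ensure_line(lines, required):
    """Return the lines with `required` present, appending it only if missing."""
    if required in lines:
        return list(lines)
    return list(lines) + [required]


def change_still_valid(lines, required):
    """The package's state change holds as long as its line is still there."""
    return required in lines


# Hypothetical passwd content and samba entry, for illustration only.
passwd = ["root:x:0:0:root:/root:/bin/bash"]
samba_entry = "samba:x:120:120::/var/lib/samba:/bin/false"

# Package install: deploy the state change "insert a new line".
passwd = ensure_line(passwd, samba_entry)

# An unrelated local edit to the same file.
passwd.append("steve:x:1000:1000::/home/steve:/bin/bash")

# The unrelated edit does not invalidate the package's change,
# so an upgrade would have no reason to prompt the user.
needs_prompt = not change_still_valid(passwd, samba_entry)
```

Because the check is on the line rather than on a checksum of the whole file, the local edit is invisible to the package, which is the behaviour argued for above.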

This doesn't mean I think the Linux package managers aren't wonderful -they are- but that Linux itself doesn't have a world view that is 100% compatible. Unless the package managers move beyond file ownership, the various parts of the system need to support aggregatable files, so that the standard file create/overwrite/delete operations that RPM/deb installers do can be used to manipulate the state of the machine without creating conflicts over who owned or edited a configuration file.
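One way to picture aggregatable files is a directory of fragments, where each package owns exactly one file and the effective configuration is the concatenation of all of them. The Python sketch below assumes that layout. The fragment names and contents are hypothetical, and the sorted-filename ordering is an assumption of the sketch, not something a particular tool is claimed to do.

```python
import tempfile
from pathlib import Path


def aggregate(fragment_dir):
    """Build the effective configuration by concatenating every fragment
    in the directory, in sorted filename order."""
    parts = []
    for fragment in sorted(Path(fragment_dir).iterdir()):
        if fragment.is_file():
            parts.append(fragment.read_text())
    return "".join(parts)


# Hypothetical fragments: each package (or the admin) owns one file.
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "10-samba.conf").write_text("user samba\n")
    (root / "20-local.conf").write_text("user steve\n")
    before = aggregate(root)

    # Removing a package is a plain file delete; nobody else's file is touched.
    (root / "10-samba.conf").unlink()
    after = aggregate(root)
```

With this layout the installer only ever creates, overwrites or deletes its own fragment, so there is no shared file to conflict over.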

Mon 1 Dec 2008
Presentation: My other computer is a datacentre

As other ApacheCon attendees will know, the title of this new presentation, My other computer is a datacentre is based on Fitz's Google Code slogan, only spelt correctly for en-gb. Brian actually provided the 40+ stickers needed to hand out one to every student in the CS course who got to see the lecture, and Tom White graciously brought them back from ApacheCon US to Wales, from whence The Royal Mail got them to my house, and then by hand to the university. In compensation, Fitz gets a slide of his own. As do Yahoo!'s Hadoop team, and our household's deployment project, who got to see a datacentre when he visited my site. The cold air coming up from the floor was right up there with the blinking lights as the key features of the room.

This is a talk on the engineering aspects of datacentres, looking at some of the implications they bring to software that runs on them. It's the sequel to Farms, Fabrics and Clouds, which listed what assumptions were no longer valid, without exploring the implications.

What you don't see in the presentation are the bits where I go off on a ramble, mainly on Power. The photo of the sunset is from a motel in Hood River, Oregon, looking at the Columbia, which is near where all the PNW datacentres are based: all that water running through dams. Google's Dalles facility is a few miles upriver behind the camera; further again comes the MSN facility, amongst others.

I gave some working MapReduce demos in this talk. Paolo has been teasing me about not bothering to write my own MR code, focusing on deploying Hadoop instead. This is my response. Six months' worth of scanned Bluetooth devices from my house turned into Erlang Records, fed through a derivative of the MapReduce engine included in the Programming Erlang book. My derivative not only applies the mapper to every record in the source datafile(s) (doing each file in parallel), it correctly terminates when there is a programming error in the mapper, forwarding the error to the shell. I have found that useful. I can now show the breakdown of devices seen by day, by hour, which devices get seen the most times or for the longest duration, etc, all in a few lines of functional programming code. Nice. This doesn't mean I am an unequivocal fan of Erlang, only that it has some features that I appreciate. Like native list and tuple support, and dynamic function creation. I'd need more time with its processes before I can conclude whether that is good or bad. "Interesting", is all I will say there for now.
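For readers who have not seen the pattern, here is a rough Python sketch of the same shape of computation. It is not the Erlang code from the talk: the record layout, the sample sightings and all function names are hypothetical, and it runs sequentially rather than processing files in parallel. As in the Erlang derivative, an exception raised in a mapper simply propagates to the caller.

```python
from collections import defaultdict
from datetime import datetime


def map_reduce(records, mapper, reducer):
    """Apply mapper to every record, group the emitted values by key,
    then reduce each group. A mapper error propagates to the caller."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical sightings: (device address, ISO timestamp).
sightings = [
    ("00:11:22:33:44:55", "2008-11-01T08:15:00"),
    ("00:11:22:33:44:55", "2008-11-01T08:45:00"),
    ("66:77:88:99:AA:BB", "2008-11-01T17:05:00"),
    ("00:11:22:33:44:55", "2008-11-02T08:20:00"),
]


def by_hour(record):
    """Emit (hour of day, 1) for each sighting."""
    _device, stamp = record
    yield datetime.fromisoformat(stamp).hour, 1


def by_device(record):
    """Emit (device address, 1) for each sighting."""
    device, _stamp = record
    yield device, 1


def count(_key, values):
    """Reduce a group of 1s to a total."""
    return sum(values)


sightings_per_hour = map_reduce(sightings, by_hour, count)
sightings_per_device = map_reduce(sightings, by_device, count)
```

Swapping the mapper is all it takes to change the breakdown, which is the appeal of the approach: the grouping and reducing machinery stays the same.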

I will play more with Erlang/my Bluetooth data and maybe write something up on it. For now, the Bath Bluetooth Study is the closest published paper on street-based Bluetooth monitoring of mobile devices.

Fri 28 Nov 2008
Ubuntu 8.10 rollout complete

I've now pushed out Ubuntu 8.10 to two laptops, and clean installed it on my home desktop. I've left the work desktop alone as problems there translate into serious productivity problems, and it's not worth the hassle.

On a laptop, the mobility and power management features make this a really good mobile Linux. It's the first Linux that feels mobile, rather than a Linux that fits on a laptop but doesn't like to be moved. On a desktop, there isn't so much compelling need to upgrade.

  1. I can't get multi-monitor working except with a reboot while the second monitor is plugged in. At least reboot is fast!
  2. No sound on either laptop. One is a known problem; on the other, the sound just went away
  3. Gnome Network Manager is trouble. On the laptops it has quirks: it wants to go offline on resumes, doesn't always select the nearest wifi network (this could be card trouble, of course)
  4. On the static desktop, Network Manager keeps stamping on resolv.conf, but doesn't remember its own DNS entries across reboots. You need to kill Network Manager.
  5. Network Manager/Ubuntu does handle 3G wireless USB dongles nicely: wizard driven up and running. Slick.
  6. For the desktop, wicd is a better alternative.
  7. Once you run VMWare, cursor keys start playing up across the whole X session

I initially thought the static DNS entries problem was due to me doing an upgrade of a previous install, so reformatted the root partition and installed the OS clean. After much grub work that came up, but the problem remained -and at least two other people I know have the same problem. This tells me something about the Network Manager team: they use DHCP for everything.

This leads me to restate "Loughran's Law of Networks", which is this:

Networked applications work best in a network architecture which matches that of the development team

Outlook and Windows Explorer networking work if there is a fast, high-availability link to the servers. Java works reliably if reverse-DNS is always fast and accurate, and assumes that machines never change IP Addresses during the life of an application. Most open source apps have an awful time with web proxies, as it's not something they have encountered. And Network Manager, well, it may work on the move, but not on static systems. Which is progress, of a sort.

Wed 26 Nov 2008
Farms, Fabrics and Clouds (slightly updated)
I've updated our Farms, Fabrics and Clouds talk, as given to the local university students this week.
Tue 25 Nov 2008
What should Sun do?

Fascinating article by Tim Bray, what should Sun do

+1 to the back-off-from-the-client story. It's over. JVMs of all kinds are commoditised. Unfortunately, Sun still have to spend lots of money keeping the Windows JVM alive and staying up with MS on features, even though MS have the Windows franchise to fund them. As the only people who use Java client apps are us developers, why not force everyone to move to Unix? And make Java integrate with Unix way, way better, by not seeking a lowest common denominator of platforms.

Server side, the future is apps running in outsourced datacentres. Some of the power budget options of their CPUs are interesting, and I'm sure Hadoop would run well on their clusters. Rather than worry about Glassfish-the-open-source-alternative-to-WebSphere, they should be thinking about Glassfish-the-pay-per-CPU-hour version, running on Sun kit with Hadoop running on an HA filesystem behind it. If the power story is good, this could work.

One problem here is the development GUI in Java land is not NetBeans. It's Eclipse. Whether you like it or not.

Sun 23 Nov 2008
Visions of a future

Some lovely articles in the NYTimes this week, if you can get past their cookie policy (hint: delete all their cookies). First, one on the NetFlix competition. There are some interesting questions about human nature itself lurking here. I was at a university lecture on the state of the AI community last week, where the lecturer observed that we do have the equivalent of "skynet" up and running -the large computer systems modelling human behaviour- but all they were doing was a statistical hack kind of AI to recommend other goods and place adverts better. The dreams of the AI community from 50 years ago for self-aware machines haven't been realised, and it's not clear that tuning the current algorithms is enough. Of course, the NYT doesn't get into looking at the whole AI research agenda, merely cinema recommendations. What interests me is whether there are correlations across products: can you predict what videos people will like from their book purchases, what they do in their spare time, everything? Can you model people using statistics alone? I also wonder where else such recommendation systems could be extended to -things in the real world?

The other article is on the ubiquity of screens. It's something to make you think. Yes, I really do spend most of my day staring at screens. Work: LCD. Home: TV + laptop, or both, as here I have to sit in the room while the sprog watches the original Star Wars for the third time this weekend and I try not to get too bored. What the article does discuss is how online video has evolved to be short 2-10 minute videos made by the community itself. Again, the sprog shows the way here: he doesn't watch kids' TV. He'd rather have time on a laptop looking for stop-motion Mars-mission Lego animations on YouTube. Why? Content that appeals to him. We have also discovered the merits of online video in our local political campaigns. A video whose URL you can post out is a fantastic way of getting your message across. That's a very interesting form of democratisation at work. Whereas before you had to get the interest of the local, regional or national TV channels to get a message out, now you don't. That's going to move power from the legacy media companies to the new web hosted players, and to all the people who upload the artifacts in the first place.

Anyway, some interesting reads. An online version of a printed paper discussing how ubiquitous screens and community publishing are going to change media. Hmmm. And this on the same day as an article discussing whether irony is dead.

Sat 22 Nov 2008
Ooh, a petabyte

Google give their times to sort a Terabyte on GFS/MapReduce in one of their datacentres: the record to beat is now 68s.

What is really impressive is they then went on to sort a full petabyte in 6 hours. Which means one petabyte in, one petabyte out, one petabyte for intermediate bits and all of this stuff replicated 3-ways: 6 petabytes all in. Spare. I guess this will become a new way of stress testing a new datacentre: get it to sort for a while.

There's a bit of commentary on "straggler management"; these can be trouble in Hadoop -and in BitTorrent. The slow machines end up as a bottleneck.
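A toy Python sketch of why stragglers hurt, assuming a phase of a job cannot complete until every one of its tasks has finished. The durations are made-up numbers for illustration, not measurements from Google's run or from any Hadoop cluster.

```python
def job_time(task_seconds):
    """A phase finishes only when its slowest task finishes."""
    return max(task_seconds)


# Hypothetical task durations in seconds: 99 healthy machines, one straggler.
healthy = [60.0] * 99
with_straggler = healthy + [300.0]

baseline = job_time(healthy)
slowed = job_time(with_straggler)
```

One slow machine out of a hundred is enough to make the whole phase take as long as that machine does, which is why spotting stragglers and re-running their work elsewhere matters at this scale.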