Hadoop error: java.io.IOException: No FileSystem for scheme: http

Posted on July 28, 2008

If you see this kind of error:

2008-07-28 10:54:08,747 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: /home/james/dfsTmp/mapred/system
java.io.IOException: No FileSystem for scheme: http
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1277)
        at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1291)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:203)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:108)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:717)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:141)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2319)

You’ve probably set this in hadoop-site.xml:

<property>
    <name>fs.default.name</name>
    <value>http://localhost:54310</value>
</property>

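Don’t use http:// in front of localhost. As the exception says, Hadoop has no FileSystem registered for the http scheme, so the default filesystem URI needs to start with hdfs:// instead. In 0.17.1 it should look like: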

<property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
</property>
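
If you want to catch this before starting the JobTracker, a small script can check the scheme for you. This is only a sketch, not part of Hadoop: the function name check_default_fs, the SAMPLE string, and the set of "expected" schemes are my own assumptions for illustration, and it assumes the usual <configuration> wrapper around the properties.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: schemes treated as acceptable for this illustration only.
# This is not an exhaustive list of what a given Hadoop release supports.
EXPECTED_SCHEMES = {"hdfs", "file"}


def check_default_fs(config_xml):
    """Return (value, ok) for fs.default.name in a hadoop-site.xml string.

    value is None if the property is missing; ok is True only when the
    URI scheme is one of EXPECTED_SCHEMES.
    """
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.default.name":
            value = prop.findtext("value", "").strip()
            return value, urlparse(value).scheme in EXPECTED_SCHEMES
    return None, False


# Hypothetical config contents, mirroring the broken setting above.
SAMPLE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>http://localhost:54310</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    value, ok = check_default_fs(SAMPLE)
    print(value, "looks fine" if ok else "has an unexpected scheme")
```

To run it against a real file, read hadoop-site.xml into a string and pass that in instead of SAMPLE.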

FoxyProxy, Hadoop, and SOCKS

Posted on June 03, 2008

FoxyProxy makes it very easy to talk to your Hadoop cluster running on EC2.

Run ssh with the -D option, which opens a local SOCKS proxy on the given port:

ssh -D 2324 ec2-75-101-XXX-XX.compute-1.amazonaws.com

Tell FoxyProxy to “use SOCKS proxy for DNS lookups” (tools > foxyproxy > more > global settings > use SOCKS proxy for DNS lookups)

Configure FoxyProxy with rules for when to use local port 2324. Use wildcards like httpec2internal*.
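
To get a feel for how wildcard rules decide which URLs go through the tunnel, here is a rough sketch using Python's fnmatch. The patterns and the use_proxy helper are hypothetical, made up for illustration; they are not the rules from my setup, and FoxyProxy's own matcher may behave differently from fnmatch in the details.

```python
from fnmatch import fnmatchcase

# Hypothetical wildcard rules, for illustration only.
PROXY_PATTERNS = [
    "http://*.ec2.internal*",
    "http://*.compute-1.amazonaws.com*",
]


def use_proxy(url, patterns=PROXY_PATTERNS):
    """Return True if the URL matches any wildcard rule."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)


if __name__ == "__main__":
    for url in (
        "http://ip-10-0-0-1.ec2.internal:50030/jobtracker.jsp",
        "http://example.com/",
    ):
        print(url, "-> proxy" if use_proxy(url) else "-> direct")
```

Anything that matches no rule goes out directly, which keeps ordinary browsing off the tunnel.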

All the features I cared about worked when set up this way.

(And of course the choice of 2324 isn’t special - use any port you like.)
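
If pages don’t load through the proxy, it’s worth checking that the ssh tunnel is actually listening on the local port. A minimal sketch (the helper name port_is_listening is my own; it only tests that something accepts TCP connections, not that it speaks SOCKS):

```python
import socket


def port_is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    port = 2324  # the local port passed to ssh -D above
    state = "listening" if port_is_listening(port) else "nothing listening"
    print(state, "on port", port)
```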

Here’s a screenshot of the proxy settings I use. It’s been a while since I configured it - I suspect that at least one of these lines is obsolete:

[screenshot: FoxyProxy proxy settings]