Tips, tricks, missteps, and minor revelations on the path to
Scala wisdom.
Propagating View Bounds to Traits
View bounds provide a means of achieving ad-hoc polymorphism in Scala. One way to think of view bounds (they play a role loosely similar to type classes in Haskell) is that they let a class accept any type for which a suitable implicit conversion exists, with the conversion resolved by the compiler at the point of instantiation. To review:
class A[T <% String] {
  // Methods here can now operate on any type `T` as if it were a
  // `String`.
  def sayHey(x: T): String = "hey! " + (x: String)
}
Here, you can only instantiate the class A with the type T if, at the point of instantiation, an implicit function of type T => String is available.
Note that the <%-notation is syntactic sugar in Scala, making the previous example equivalent to:
class A[T](implicit tToString: (T => String)) {
  // Methods here can now operate on any type `T` as if it were a
  // `String`.
  def sayHey(x: T): String = "hey! " + (x: String)
}
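To make the instantiation rule concrete, here is a minimal sketch. It uses the desugared form with an explicit implicit parameter and calls the function directly rather than relying on the conversion, so it does not depend on view-bound syntax. The `User` and `Opaque` types and the `ViewBoundSketch` object are hypothetical names introduced only for illustration.

```scala
object ViewBoundSketch {
  // Hypothetical types, used only for this illustration.
  final case class User(name: String)
  final class Opaque

  // Desugared form of `class A[T <% String]`, applying the function explicitly.
  class A[T](implicit tToString: T => String) {
    def sayHey(x: T): String = "hey! " + tToString(x)
  }

  // The implicit function that makes `A[User]` instantiable in this scope.
  implicit val userToString: User => String = _.name

  val a = new A[User]
  val greeting: String = a.sayHey(User("marius"))

  // `new A[Opaque]` would be rejected by the compiler here, because no
  // implicit `Opaque => String` is in scope.
}
```

The check happens where `new A[User]` is written, which is why the implicit has to be visible at that point and not merely somewhere in the program.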
However, view bounds break down when you introduce self-type annotated traits.
class A[T <% String] {
  // Methods here can now operate on any type `T` as if it were a
  // `String`.
}
trait SomeTrait[T] { self: A[T] =>
  // this fails to compile:
  def sayHey(x: T): String = "hey! " + (x: String)
}
Even though we used a self-type annotation, the compiler doesn’t bring the implicit function from the “parent” class into scope in the trait. The problem is the way the syntactic sugar works here: it binds the implicit to the lexical scope of the parent class.
However, if we declare the implicit as a val instead, it becomes a member of A, and members are visible through the self-type. This works:
class A[T](implicit val tToString: (T => String)) {
  // Methods here can now operate on any type `T` as if it were a
  // `String`.
  def sayHey(x: T): String = "hey! " + (x: String)
}
trait SomeTrait[T] { self: A[T] =>
  // this now compiles: `tToString` is an implicit member of `A`
  def sayHey(x: T): String = "hey! " + (x: String)
}
However, you forgo the succinctness of the view-bound notation, and it doesn’t nest: adding another trait
trait SomeSubTrait[T] { self: SomeTrait[T] =>
  // this fails to compile:
  def sayHeyAgain(x: T): String = "hey again! " + (x: String)
}
will again fail to compile.
I ended up working around this with a simple trick (and, I think, a prettier solution to boot): “exporting” the implicit definition.
class A[T <% String] {
  // Methods here can now operate on any type `T` as if it were a
  // `String`.
  implicit def tToString(t: T): String = t
  def sayHey(x: T): String = "hey! " + (x: String)
}
trait SomeTrait[T] { self: A[T] =>
  // the implicit from `A` is now available here.
  def sayHey(x: T): String = "hey! " + (x: String)
}
This still does not nest by itself, but you can work around that by again re-exporting the implicit made available to the trait.
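As a rough sketch of that re-export, the version below uses the explicit `implicit val` form and calls the function directly, so it also compiles on Scala versions where view bounds are no longer available. The names `exportedToString`, `User`, and `ReExportSketch` are assumptions introduced for illustration, not part of the original code.

```scala
object ReExportSketch {
  // Hypothetical type, used only for this illustration.
  final case class User(name: String)

  class A[T](implicit val tToString: T => String) {
    def sayHey(x: T): String = "hey! " + tToString(x)
  }

  trait SomeTrait[T] { self: A[T] =>
    // Re-export: `A`'s members are visible through this self-type, but they
    // are not members of SomeTrait. Declaring a member here makes the
    // function reachable from traits whose self-type is SomeTrait[T].
    implicit def exportedToString: T => String = self.tToString
  }

  trait SomeSubTrait[T] { self: SomeTrait[T] =>
    // Only SomeTrait's own members are visible here, so use the re-export.
    def sayHeyAgain(x: T): String = "hey again! " + exportedToString(x)
  }

  implicit val userToString: User => String = _.name

  val a = new A[User] with SomeTrait[User] with SomeSubTrait[User]
}
```

Each further level of nesting needs the same treatment: a trait whose self-type is `SomeSubTrait[T]` would only see what `SomeSubTrait` itself declares.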
—
Marius A. Eriksen,
Apr 26, 2010
Warning About Tail Calls
We ran into a scary case last week where a code change could reliably make the JVM (1.6) segfault
every 3-4 hours. At first, I thought it might be a “cosmic ray” kind of phenomenon, like a
particular alignment of the bytecode or some edge case in the scala (2.7.7) compiler, but as we
started rolling back the code to triage, it was clear that some particular piece of code was
reliably causing the segfault. Code bisection pinned it down to a tail-call recursion:
def foreach(blocking: Boolean, f: Job => Unit) {
  val continue = {
    getJob(blocking) match {
      case None =>
        false
      case Some(job) =>
        f(job)
        true
    }
  }
  if (continue) {
    foreach(blocking, f)
  }
}
(The code is simplified a bit for clarity.)
The tail-call recursion is pulled out of any potential
block because otherwise the recursive call may be nested inside an inner class, which would prevent
scalac from being able to create a tail-call. But it doesn’t seem to have helped, because a dump of
the compiled bytecode shows that scalac failed to create a tail-call anyway:
186: iload_3
187: ifeq 196
190: aload_0
191: iload_1
192: aload_2
193: invokevirtual #245; //Method foreach:(ZLscala/Function1;)V
196: return
Translation: If local-3 (continue) is false, return; otherwise, do a normal java recursive call.
Possibly the JVM doesn’t correctly catch a stack overflow inside java (?) and the segfault was just
the java thread running out of stack space for recursion.
For now, I would recommend avoiding tail calls. It’s very tricky to figure out when scalac will be
able to avoid actual java-stack recursion. Apparently scala 2.8 will have an annotation you can add
to methods to make the compiler warn you when it can’t properly optimize the recursion into a tail
call.
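Scala 2.8 did ship that annotation, as `scala.annotation.tailrec`. The sketch below shows how it could be applied to a method shaped like the one above. One plausible reason scalac left the original call alone (an assumption; nothing in this post confirms it) is that scalac only rewrites self-recursive calls in methods that cannot be overridden, meaning final, private, or local ones. The `Job` and `JobQueue` definitions here are hypothetical stand-ins for the real code.

```scala
import scala.annotation.tailrec

object TailCallSketch {
  // Hypothetical stand-ins for the real job type and queue.
  final case class Job(id: Int)

  class JobQueue(initial: List[Job]) {
    private var jobs: List[Job] = initial

    // Returns the next job, or None when the queue is empty.
    // `blocking` is ignored in this in-memory sketch.
    def getJob(blocking: Boolean): Option[Job] = jobs match {
      case head :: tail =>
        jobs = tail
        Some(head)
      case Nil =>
        None
    }

    // `final` means the method cannot be overridden, so scalac may turn the
    // self-call into a loop. With @tailrec, compilation fails if it cannot.
    @tailrec
    final def foreach(blocking: Boolean, f: Job => Unit): Unit = {
      getJob(blocking) match {
        case None =>
          ()
        case Some(job) =>
          f(job)
          foreach(blocking, f)
      }
    }
  }
}
```

With the annotation in place, a change that accidentally moves the recursive call out of tail position is reported at compile time instead of showing up as a deep stack in production.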
—
Robey Pointer,
Jan 11, 2010
New Year’s Housekeeping
The migration from Blogger that I did a while back wasn’t without its warts. For one, it lost per-post authorship attribution. There were also sundry formatting errors, and a number of older posts that didn’t get the benefit of syntax highlighting for inline code.
I’ve moved this site over to GitHub Pages and corrected the above issues. The layout and styling have also been tweaked a bit.
An advantage of moving to GitHub: if you’d like to contribute a post, simply fork the repository and add a post to the _posts directory. Follow the convention of adding your name to the author key of the YAML frontmatter. Send me a pull request, and I’ll post your contribution.
—
Alex Payne,
Jan 02, 2010
A Generic Thrift Deserializer In Scala
So you’ve got a file full of binary serialized Thrift objects, delimited by the size of the serialized objects in bytes. Maybe that file is gzipped, and maybe it isn’t. You need those objects. And wouldn’t it be nice if the code was reusable? Check ’er out:
How do you put it to use? Simple:
val scanner = new ThriftFileScanner[YourThriftClassName]
scanner.allRecordsFromFile("/tmp/dump.thrift.gz") { println(_) }
Yup, you just hand it the path to a file and a block. As each record is deserialized, it’s handed to your block.
Notes:
1. If the file o’ serialized Thrift objects is big enough, you might exhaust your JVM’s heap space. This seems to happen more frequently with gzipped files. Update, Tuesday, December 1, 2009: We’ve sorted this out. We weren’t wrapping the GZIPInputStream correctly, but all is well now, even for big honkin’ files. The Gist has been updated, and we fixed a couple other bugs and removed some unnecessary junk while we were at it.
2. Check out the use of manifests and type bounds. What we’re saying in the class definition is “this class accepts any subclass of TBase, and we’re referring to whatever that is as T, and we’re going to stick information about T into a variable called ‘man’ at runtime.” T is a placeholder, and manifests give us information about what ends up in that place.
Later on, we create a new instance of whatever T might be (in the above example, it’s YourThriftClassName) by calling the erasure method on man, which gives us back the class of T. Runtime reflection while remaining type safe. Cool. This is the first example I’ve seen that uses both manifests for genericity and higher-order functions. Not that doing so is particularly difficult, it just doesn’t seem to have come up elsewhere.
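The combination described in these notes (an upper type bound, runtime class information, and a block applied to each record) can be sketched without Thrift on the classpath. This is not the scanner from the Gist: `java.lang.Appendable` stands in for `TBase`, appending the payload stands in for Thrift deserialization, and the names `RecordScanner`, `allRecords`, and `frame` are assumptions made for illustration. It also uses `ClassTag` and `runtimeClass`, which superseded `Manifest` and `erasure` for this purpose in later Scala versions.

```scala
import java.io.{ByteArrayOutputStream, DataInputStream, DataOutputStream, EOFException, InputStream}
import scala.reflect.ClassTag

object ScannerSketch {
  // `Appendable` is a stand-in for Thrift's TBase so the sketch is self-contained.
  class RecordScanner[T <: Appendable](implicit tag: ClassTag[T]) {

    // Reads records framed as a 4-byte size followed by that many bytes,
    // handing each reconstructed record to `f`.
    def allRecords(in: InputStream)(f: T => Unit): Unit = {
      val data = new DataInputStream(in)

      // EOF while reading the size prefix ends the scan.
      def nextSize(): Option[Int] =
        try Some(data.readInt()) catch { case _: EOFException => None }

      var size = nextSize()
      while (size.isDefined) {
        val bytes = new Array[Byte](size.get)
        data.readFully(bytes)
        // The ClassTag supplies the runtime class of T, which is used to
        // create a fresh instance via its no-argument constructor.
        val record = tag.runtimeClass.getDeclaredConstructor().newInstance().asInstanceOf[T]
        record.append(new String(bytes, "UTF-8")) // stand-in for deserialization
        f(record)
        size = nextSize()
      }
    }
  }

  // Helper for trying the scanner out: frames each payload with its size.
  def frame(payloads: Seq[String]): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new DataOutputStream(buffer)
    payloads.foreach { p =>
      val bytes = p.getBytes("UTF-8")
      out.writeInt(bytes.length)
      out.write(bytes)
    }
    out.flush()
    buffer.toByteArray
  }
}
```

For a gzipped file, the same method could be handed a `java.util.zip.GZIPInputStream` wrapped around a `FileInputStream`, since it only depends on `InputStream`.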
So! Go forth and deserialize, friends.
—
Alex Payne,
Nov 25, 2009
Building Thrift Dependencies In Scala with sbt
So you’re using the fantastic simple-build-tool (sbt) to build your Scala project. Not only that, but you’re using Thrift for cross-language, high-performance RPC support. Nothing but the latest and greatest for you, eh?
Out of the box, though, sbt knows nothing about Thrift. Fortunately, it’s easy to wire that up. You just want to throw something like this in your build file (you know, the one in project/build that you created while following along with the superb documentation that sbt offers?):
lazy val thrift = task {
  val javaDirectoryPath = "src/main/java"
  val rubyDirectoryPath = "src/main/ruby"
  val thriftFile = "src/main/thrift/YourThriftDealie.thrift"
  "thrift --gen java -o %s %s".format(javaDirectoryPath, thriftFile) ! log
  "thrift --gen rb -o %s %s".format(rubyDirectoryPath, thriftFile) ! log
  None
} describedAs("Build Thrift stuff.")
override def compileAction = super.compileAction dependsOn(thrift)
override def compileOrder = CompileOrder.JavaThenScala
Now, keep in mind that this just handles a single Thrift definition, and only produces generated code in Java and Ruby. But, at least it’s enough to give you the groundwork to do fancier things, like generate code in umpteen different languages for a hojillion different Thrift definitions.
Note that the CompileOrder.JavaThenScala assignment is critical; without it, your Scala code won’t know about anything that Thrift generated, because it won’t have been built.
So far, we’ve had great success with sbt. We took a project with 500+ lines of Ant and Ivy XML muck down to one tidy pure Scala build file that’s not even 80 lines long, including comments. It’s magic! Except that it isn’t, because if something is puzzling you can just go and read the sbt source code, which is all clever, idiomatic Scala.
Update, November 28 2009: sbt’s author, Mark Harrah, was nice enough to email us some pointers.
- I recommend always using absolute paths when calling out to an external program. If you ever use multi-projects, for example, the current working directory is always that of the root project and can be surprising when in subprojects.
- There is an execTask that fails if the forked process fails. (Right now, your thrift action will always succeed).
- I use src_managed for generated sources to keep them in a separate hierarchy. It makes it easier to exclude from version control and easier to clean.
- If thrift doesn’t clean the output directory before running, you can make your thrift task clean the outputs first.
Mark suggests that we could refactor the above as:
def javaDirectoryPath = "src_managed" / "main" / "java"
def rubyDirectoryPath = "src_managed" / "main" / "ruby"
def thriftDirectoryPath = "src_managed" / "main" / "thrift"
def thriftFile = thriftDirectoryPath / "YourThriftDealie.thrift"
def thriftTask(tpe: String, directory: Path, thriftFile: Path) = {
  val cleanIt = cleanTask(directory) named("clean-thrift-" + tpe)
  execTask {
    // you can do "thrift ...".format and pass a String here instead of inline xml
    // `-o` names the output directory, as in the original task above
    <x>thrift --gen {tpe} -o {directory.absolutePath} {thriftFile.absolutePath}</x>
  } dependsOn(cleanIt)
}
lazy val thriftJava = thriftTask("java", javaDirectoryPath, thriftFile) describedAs("Build Thrift Java")
// the Ruby generator is named `rb`, as in the original task above
lazy val thriftRuby = thriftTask("rb", rubyDirectoryPath, thriftFile) describedAs("Build Thrift Ruby")
override def compileAction = super.compileAction dependsOn(thriftJava, thriftRuby)
This assumes that the order of Thrift compilation between Java and Ruby doesn’t matter (which, in our case, it doesn’t). Thanks, Mark!
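Two of Mark’s pointers, using absolute paths and failing when the forked process fails, can also be illustrated outside of sbt with the standard `scala.sys.process` package. This is only a sketch of the idea, not sbt’s `execTask`; the names `thriftCommand` and `runOrFail` are hypothetical.

```scala
import java.io.File
import scala.sys.process._

object ThriftProcessSketch {
  // Builds the thrift invocation with absolute paths, so the result does not
  // depend on the working directory of whoever runs it.
  def thriftCommand(generator: String, outputDir: File, thriftFile: File): Seq[String] =
    Seq("thrift", "--gen", generator, "-o", outputDir.getAbsolutePath, thriftFile.getAbsolutePath)

  // Runs the command and reports a non-zero exit code as a failure instead of
  // silently succeeding. Throws an IOException if the executable is missing.
  def runOrFail(command: Seq[String]): Either[String, Unit] = {
    val exitCode = command.!
    if (exitCode == 0) Right(())
    else Left("exit code " + exitCode + ": " + command.mkString(" "))
  }
}
```

Passing the command as a `Seq[String]` rather than one formatted string also avoids surprises when a path contains spaces.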
—
Alex Payne,
Nov 24, 2009