Kamon 0.2.5/0.3.5 has Landed!

11 Nov 2014

We are very happy to announce that Kamon 0.2.5/0.3.5 has been released! This release comes with many improvements and new features that couldn’t be there without the valuable help of our contributors, as well as our super awesome users reporting issues, suggesting improvements, and giving feedback. To all of you, our most sincere thanks!

About three months have passed since our previous release, and a lot has happened during this time. Let’s start with our usual compatibility notes:

  • 0.3.5 is compatible with Akka 2.3, Spray 1.3, Play 2.3 and Scala 2.10.x/2.11.x
  • 0.2.5 is compatible with Akka 2.2, Spray 1.2, Play 2.2 and Scala 2.10.x

And now the juicy part: here is a description of the most notable changes introduced in this release. If you want more detail about the issues fixed, please check our changelog.

Experimental Support for Akka Cluster/Remoting

We blogged some time ago when we introduced support for Akka Remoting and Cluster. Since then, a few enhancements were added to make the experience as smooth as possible, including moving all remoting/cluster-related code and instrumentation into a new kamon-akka-remote module that you add to your project only if you actually need it. With regards to what it does, it’s basically a more stable version of the first experimental snapshot:

  • First, it doesn’t blow up in your face anymore. Prior to this release, users of remoting and cluster might experience errors because some parts of Kamon were not Serializable, making it impossible for them to use Kamon in such projects. We did what was necessary to ensure that all messages flow without problems; keep reading for more on that.

  • TraceContext propagation for regular messages: TraceContexts are propagated when sending a message to remote actors in the same way it works with local actors. This also covers messages sent via ActorSelection and routers with remote routees.

  • TraceContext propagation for system messages: If an actor with a remote supervisor fails, the TraceContext is also propagated through system messages to remote actors, meaning that if one of your actors fails in node A and its supervisor lives in node B, the TraceContext will still be there. This also covers the actor creation scenario: if you deploy a new remote actor while a TraceContext is available (e.g. using an actor per request and deploying in a remote node), the TraceContext available in the local node when deploying the remote actor will also be available in the remote node when creating the actor instance.

  • A Trace can start in one node and finish in another: As simple as it sounds: start a TraceContext in one node, let it propagate through messages to remote nodes, and finish it over there! It sounds simple, but comes with a couple of things you need to know about:

    • There is little protection from closing a TraceContext more than once. One of the biggest challenges of providing this support comes from the fact that we use the TraceContext to measure how long a request/feature execution takes to complete, which is a relatively easy task when everything is local but challenging when working with distributed systems. For now, we decided not to provide full protection against finishing a remote TraceContext more than once, because the cost of keeping a consistent view of the state of all TraceContexts across the cluster seems prohibitive performance-wise. We might reconsider this in the future based on feedback. If you understand how the TraceContext works and how it is propagated, you will be just fine; if in doubt, just drop us a line on the mailing list.
    • Even though all of our timing measurements are taken and reported in nanoseconds for local TraceContexts, when finishing a remote TraceContext we can’t rely on the JVM-relative System.nanoTime and have to fall back to System.currentTimeMillis. Please make sure that you keep your clocks synchronized across the cluster.
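The timing point above can be sketched in a few lines. This is not Kamon’s actual code, just an illustration of why the fallback is needed: System.nanoTime has an arbitrary origin per JVM, so only wall-clock milliseconds can be compared across nodes, and any clock skew between the nodes shows up directly in the measured elapsed time.

```scala
// Sketch (not Kamon's implementation): measuring elapsed time for a trace
// that starts on node A and finishes on node B.
object RemoteTraceTiming {
  // Local trace: both timestamps come from the same JVM, so nanoTime is safe.
  def localElapsedNanos(startNanos: Long, endNanos: Long): Long =
    endNanos - startNanos

  // Remote trace: startMillis was taken on node A, endMillis on node B.
  // Clock skew between the nodes shows up directly in the result, so we
  // clamp negative values rather than report a nonsensical elapsed time.
  def remoteElapsedMillis(startMillis: Long, endMillis: Long): Long =
    math.max(0L, endMillis - startMillis)
}
```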

Improvements to our Core

The most notable improvement, besides fixing a few issues reported by our users, is that our Segments now have a more clearly defined API that works nicely and became a solid base for other segment-related improvements that you will see below. We also moved from using Option[TraceContext] to the null object idiom; this makes for a cleaner API for TraceContext manipulation and will hopefully make the life of our Java users less painful. If you are in Java land, stay tuned: we will certainly show you more love in the days to come!
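The null object idiom mentioned above can be sketched as follows. The types here are simplified stand-ins, not Kamon’s real kamon.trace classes: the point is that callers always get a TraceContext they can operate on, and every operation on the empty context is a safe no-op, with no Option unwrapping and no null checks.

```scala
// Simplified sketch of the null-object idiom for a trace context.
sealed trait TraceContext {
  def name: String
  def isEmpty: Boolean
  def rename(newName: String): TraceContext
}

// A real, active trace context.
case class DefaultTraceContext(name: String) extends TraceContext {
  val isEmpty = false
  def rename(newName: String): TraceContext = copy(name = newName)
}

// The null object: stands in for "no trace context" and turns every
// operation into a no-op, so callers never need Option or null checks.
case object EmptyTraceContext extends TraceContext {
  val name = "empty"
  val isEmpty = true
  def rename(newName: String): TraceContext = this
}
```

From Java this is particularly pleasant: code can call `ctx.rename(...)` unconditionally instead of pattern matching on `Option<TraceContext>`.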

External Trace and Segment Name Assignment for Play and Spray

As you might already know, we start the TraceContext for you when using Play and Spray, and when we do so, we need to assign a name to the trace. The default name that we generate has the form METHOD: /path, which isn’t the nicest name in the world but at least gives you something to start with. It has one major drawback: if parts of your path change with each request, you will get a different trace name for each of them, and you probably don’t want to see metrics for GET: /users/1 and GET: /users/2 but rather metrics for GetUser. The solution is simple: we provide a Play action and a Spray directive that allow you to set the trace name to whatever you need. Additionally, if you don’t want to mess with your actions and routes and prefer to keep your code 100% Kamon-free, you can now provide your own implementation of kamon.spray.SprayNameGenerator (and its Play counterpart) via the corresponding configuration settings.
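A custom name generator might look like the sketch below. The trait and method names here are illustrative, not the exact kamon.spray.SprayNameGenerator signature; the idea is simply to collapse variable path segments so that GET: /users/1 and GET: /users/2 map to a single trace name.

```scala
// Hypothetical name-generator sketch: collapse numeric path segments so all
// requests for the same logical resource share one trace name.
trait NameGenerator {
  def generateTraceName(method: String, path: String): String
}

object IdCollapsingNameGenerator extends NameGenerator {
  // Replace any all-digit path segment with the literal ":id".
  def generateTraceName(method: String, path: String): String =
    method + ": " + path.replaceAll("/\\d+(?=/|$)", "/:id")
}
```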

Complete Rewrite of our New Relic Module

Numbers don’t lie, and our Sonatype numbers say that about half of our users are using the kamon-newrelic module, so we wanted to make their lives better. That’s why we now report the segments collected via HTTP clients such as spray-client and play-WS, which takes you from seeing only the JVM time in the application overview to something like this:

Include the HTTP client calls as scoped metrics in the Transactions tab:

And allow you to see a breakdown of external services in the External Services tab:

Also, when errors are reported you will no longer see the thread name as the URI but rather the correct trace name assigned to the request. We will continue to improve our New Relic support in the following releases and this rewrite sets a more modular base to work on reporting more metrics and errors. One day, not too far from now, you should be able to report data from Kamon to New Relic without using the New Relic agent at all.

Usability Improvements for our StatsD Module

You can now configure the default metric key generator to enable or disable including the host name in the metrics path, and to provide a fixed hostname if you like. Also, if you have a totally different metric namespacing convention, you can now provide your own implementation of kamon.statsd.MetricKeyGenerator via the kamon.statsd.metric-key-generator configuration setting. We also added router metrics and context switch metrics to StatsD.
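A custom key generator could look like the sketch below. The real interface is kamon.statsd.MetricKeyGenerator; the signature and sanitization rules here are illustrative assumptions, showing how the hostname toggle and a custom namespacing convention fit together.

```scala
// Illustrative sketch of a StatsD metric key generator (the real trait is
// kamon.statsd.MetricKeyGenerator; this signature is an assumption).
trait SimpleKeyGenerator {
  def generateKey(application: String, hostname: String,
                  category: String, entity: String): String
}

class ConfigurableKeyGenerator(includeHostname: Boolean) extends SimpleKeyGenerator {
  // Dots and colons are part of StatsD's wire syntax, so strip them
  // from user-supplied parts of the key.
  private def sanitize(s: String): String = s.replace('.', '_').replace(':', '-')

  def generateKey(application: String, hostname: String,
                  category: String, entity: String): String = {
    val host = if (includeHostname) sanitize(hostname) + "." else ""
    s"$application.$host$category.${sanitize(entity)}"
  }
}
```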

TraceContext Propagation with Scalaz Futures

Well, simple but useful: the TraceContext available when creating a Scalaz Future is also available when running the Future’s body, as well as in all callbacks related to it, pretty much in the same way as our instrumentation of Scala’s Futures works.
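The capture-at-creation idea behind this instrumentation can be illustrated with plain Scala Futures and a thread-local context (this is a standalone sketch, not Kamon’s bytecode instrumentation, and the names are ours): whatever context is current when the Future is created travels with it, even when the body runs on another thread.

```scala
import scala.concurrent.{Await, Future, ExecutionContext}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Sketch of capture-at-creation context propagation. Kamon does this
// transparently via instrumentation; here it is made explicit.
object ContextPropagation {
  private val current = new ThreadLocal[String] {
    override def initialValue = "empty"
  }

  // Run `body` with `ctx` as the current context, restoring the old one after.
  def withContext[T](ctx: String)(body: => T): T = {
    val previous = current.get()
    current.set(ctx)
    try body finally current.set(previous)
  }

  // Capture the caller's context now; re-install it around the body when
  // the Future actually runs, possibly on a different thread.
  def tracedFuture[T](body: => T)(implicit ec: ExecutionContext): Future[T] = {
    val captured = current.get()
    Future(withContext(captured)(body))
  }

  def currentContext: String = current.get()
}
```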

The artifacts are already available in Sonatype. Give this release a try and let us know if you run into any issues!
