Reducing complexity in UML tool development

Perhaps the biggest challenge with UML is its sheer size and complexity. UML is complex in many obvious ways: conceptually, syntactically and semantically. However, it is also complex in a less obvious way - it is complex for tool developers to implement. It is this form of complexity that I want to talk about here because I think that at least some of it can be reduced.

Think about what a UML tool has to do in order to simply render a UML class model:

  • It reads in a textual representation of the model. This should be in XMI (XML Metadata Interchange) format, but it is often in some proprietary format.
  • It has to convert this into an abstract syntax tree (AST) which is implemented in whatever programming language the tool is written in.
  • It has to walk the tree and render the graphical elements.

When it saves the model to disk, it has to do the first two steps of this in reverse.

So there's quite a lot of complexity here, but much of it (I suggest) is unnecessary and may be removed by using the appropriate tools. As a starting point, let's assume that the UML tool is written in Java. Think about the transformation:

XMI -> Java

Wouldn't it be better if we could get rid of this transformation entirely? Conceptually it is quite unnecessary. The XMI represents the AST of the UML class diagram reasonably well. It is purely a matter of pragmatics that we have to convert that XML AST to Java, a language that has no built-in support for ASTs. This, of course, is another problem. Our target language has to implement some sort of framework to represent the AST of the UML class diagram. This is another level of complexity which, as we will see, is entirely unnecessary, provided the right development tools are used.

If instead of using XML and Java, we were to use a language that has built-in support for ASTs, then this whole area of complexity would simply vanish. Such a language is Lisp. In particular, Clojure is a modern Lisp that runs on the JVM and allows access to all of the Java libraries. This would seem to be the optimal choice for developing a UML modelling tool. The big advantages of Clojure over Java for this task are:

  • Clojure is homoiconic - Lisp code is Lisp data. This makes metaprogramming a breeze. It means that the AST and the code that manipulates it can be in precisely the same language (Lisp). 
  • Clojure (Lisp) syntax actually is an AST - no transformations or class libraries are necessary. 
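
To make the "code is data" point concrete, here is a minimal sketch using only core Clojure functions. The form being read is just an illustration in the style of this article; nothing here depends on klass or operation actually being defined, because the form is only read, never evaluated.

```clojure
;; Reading Clojure source text gives ordinary data: a list whose first
;; element is a symbol and whose second element is a string.
(def form (read-string "(klass \"C1\" (operation \"op1\"))"))

(list? form)   ; the form is a plain list
(first form)   ; the symbol klass
(second form)  ; the string "C1"

;; Because it is data, it can be edited with normal sequence functions
;; before it is ever evaluated, for example to rename the class.
(def renamed
  (apply list (first form) "C1Renamed" (drop 2 form)))
```

The renamed form is still valid Clojure source, so the same representation serves as both program and model.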

So a compelling vision for a modern UML tool would be:

  • Textual representation implemented as Clojure code which is already an AST.
  • Graphical rendering - Clojure leveraging the Java Graph libraries.

Essentially, everything is Lisp! XMI can trivially be emitted from an AST in Lisp simply by walking the tree. Similarly (but with slightly more effort), the Lisp AST may be constructed from XMI if needed. This direction is harder because of all of the cruft that XML adds to the AST representation.

Think about this further. UML has the Object Constraint Language that allows constraints to be stated on UML models. Because the AST is Lisp, and Lisp code is Lisp data, implementing a constraint language on a Lispy representation of UML is trivial. No new language is required. The same is true for UML transformation and action languages. All can be implemented as (at worst) a very thin layer on top of an underlying Lisp substrate.
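
As a rough illustration of what such a constraint might look like, here is a sketch of one rule ("attribute names are unique within a class") written as an ordinary Clojure predicate over a map-based AST. The helper names nodes, unique-attribute-names? and violations are my own inventions for this example, and the small AST is hand-written; this is the kind of rule OCL would typically state with isUnique, not a replacement for OCL itself.

```clojure
;; A tiny hand-written AST in the map-and-set shape used in this article.
(def pkg
  {:metaclass :package :name "P2"
   :elements #{{:metaclass :class :name "C2"
                :elements #{{:metaclass :attribute :name "a1" :type "DataTypes::int"}
                            {:metaclass :attribute :name "a2" :type "DataTypes::string"}}}}})

;; Hypothetical helper: every node in the tree, root included.
(defn nodes [root]
  (tree-seq :elements :elements root))

;; The constraint itself: no two attributes of a class share a name.
(defn unique-attribute-names? [class-node]
  (let [names (->> (:elements class-node)
                   (filter #(= :attribute (:metaclass %)))
                   (map :name))]
    ;; distinct? needs at least one argument, so guard the empty case.
    (or (empty? names) (apply distinct? names))))

;; Names of all classes under root that break the constraint.
(defn violations [root]
  (->> (nodes root)
       (filter #(= :class (:metaclass %)))
       (remove unique-attribute-names?)
       (map :name)))
```

Calling (violations pkg) on the model above yields an empty sequence; a class holding two attributes named "a1" would be reported by name.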

Here is a simple example of a "Lispy" representation of a simple UML class model:

(def m1 (model "M1"
          (package "DataTypes"
            (datatype "int")
            (datatype "string"))
          (package "P1"
            (package "P2"
              (klass "C2"
                (attribute "a1" "DataTypes::int")
                (attribute "a2" "DataTypes::string")
                (attribute "a3" "P1::P3::C1")))
            (package "P3"
              (klass "C1"
                (operation "op1"))))))

This representation is just Lisp (Clojure) code. It may be executed to create an in-memory AST. It may be saved to disk in this format, then read back in and executed. It is inherently human readable.

It's not magic - model, package, klass etc. are all functions. Calling these functions generates the AST, which is also Lisp code. Each function looks like this:

(defn klass [name & params]
  {:metaclass :class :name name :id (gensym) :visibility "public" :elements (set params)})

Yes - that's right - just a single line of code that returns a map with a nested set of child nodes. 
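
For completeness, here is a sketch of what the sibling functions might look like. Only klass is taken from this article; the other definitions are my guess, with their fields chosen to mirror the generated AST shown in this article (for example, attributes carry a :multiplicity of "1" and datatypes have no :visibility). Treat them as assumptions rather than the actual implementation.

```clojure
;; klass is as given in the article; the rest are assumed to follow suit.
(defn klass [name & params]
  {:metaclass :class :name name :id (gensym) :visibility "public" :elements (set params)})

(defn model [name & elements]
  {:metaclass :model :name name :id (gensym) :elements (set elements)})

(defn package [name & elements]
  {:metaclass :package :name name :id (gensym) :visibility "public" :elements (set elements)})

(defn datatype [name]
  {:metaclass :datatype :name name :id (gensym)})

(defn attribute [name type]
  {:metaclass :attribute :name name :type type :id (gensym)
   :multiplicity "1" :visibility "public"})

(defn operation [name & elements]
  {:metaclass :operation :id (gensym) :name name :visibility "public" :elements (set elements)})

;; Evaluating a model expression simply calls these functions inside out.
(def m (model "M1"
         (package "P1"
           (klass "C1"
             (operation "op1")))))
```

Because arguments are evaluated before the enclosing call, the innermost nodes are built first and each parent just collects its children into a set.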

The generated AST looks like this:

{:metaclass :model, :name "M1", :id G__758, :elements
 #{{:metaclass :package, :name "P1", :id G__757, :visibility "public", :elements
    #{{:metaclass :package, :name "P3", :id G__756, :visibility "public", :elements
       #{{:metaclass :class, :name "C1", :id G__755, :visibility "public", :elements
          #{{:metaclass :operation, :id G__754, :name "op1", :visibility "public", :elements #{}}}}}}
      {:metaclass :package, :name "P2", :id G__753, :visibility "public", :elements
       #{{:metaclass :class, :name "C2", :id G__752, :visibility "public", :elements
          #{{:metaclass :attribute, :name "a3", :type "P1::P3::C1", :id G__751, :multiplicity "1", :visibility "public"}
            {:metaclass :attribute, :name "a1", :type "DataTypes::int", :id G__749, :multiplicity "1", :visibility "public"}
            {:metaclass :attribute, :name "a2", :type "DataTypes::string", :id G__750, :multiplicity "1", :visibility "public"}}}}}}}
   {:metaclass :package, :name "DataTypes", :id G__748, :visibility "public", :elements
    #{{:metaclass :datatype, :name "string", :id G__747}
      {:metaclass :datatype, :name "int", :id G__746}}}}}

As you can see, there is a bit more complexity in the AST, but not really very much. In particular, there are no frameworks, no design patterns (Visitor etc.), just pure Lisp. The AST is constructed from maps - it is very simple and flexible, and there is no need for an OO representation at all. The transformation from the Lispy UML to AST is trivial.
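
To show what "no Visitor" means in practice, here is a small sketch that walks a map-based AST with plain recursion and collects qualified names such as "P1::P3::C1". The function qualified-names is a hypothetical helper of mine, and I am assuming (as the type strings in this article suggest) that the model's own name is not part of the path.

```clojure
(require '[clojure.string :as str])

;; A cut-down, hand-written AST in the same shape as the generated one.
(def ast
  {:metaclass :model :name "M1"
   :elements #{{:metaclass :package :name "P1"
                :elements #{{:metaclass :package :name "P3"
                             :elements #{{:metaclass :class :name "C1" :elements #{}}}}
                            {:metaclass :package :name "P2"
                             :elements #{{:metaclass :class :name "C2" :elements #{}}}}}}}})

(defn qualified-names
  "Qualified names of every node of the given metaclass below root.
  The root's own name is left out of the path (an assumption)."
  ([root metaclass] (qualified-names root metaclass []))
  ([node metaclass path]
   (mapcat (fn [child]
             (let [child-path (conj path (:name child))
                   here       (when (= metaclass (:metaclass child))
                                [(str/join "::" child-path)])]
               ;; Emit this child if it matches, then descend into it.
               (concat here (qualified-names child metaclass child-path))))
           (:elements node))))

(set (qualified-names ast :class))
```

The whole traversal is one function over maps and sets; adding a new kind of walk means writing another function, not extending a class hierarchy.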

Similarly, rendering the AST as XMI is trivial - just walk the tree and call the appropriate rendering function for each node. We use a polymorphic function (a Clojure multimethod) called emit-xmi that is dispatched on the metaclass of the node. These functions look like this:

(defmethod emit-xmi :class [params]
  (prxml [:ownedMember {:isAbstract "false" :isLeaf "false" :name (params :name) :visibility "public" :xmi:id (params :id) :xmi:type "uml:Class"}
          (map emit-xmi (params :elements))]))

The function uses the Clojure prxml library to emit the XMI. Again, note how simple and direct this is. No Visitors or any of the rest of the Java cruft.
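
If you want to experiment with the dispatch mechanism without pulling in an XML library, here is a library-free sketch of the same idea. It is not the emit-xmi shown in this article: the multimethod emit-node is a hypothetical name, it returns nested vectors in the [tag attributes & children] style instead of printing, and the :ownedOperation tag and the :default fallback are my assumptions.

```clojure
;; Dispatch on the :metaclass key of each node; a keyword is a valid
;; dispatch function because keywords look themselves up in maps.
(defmulti emit-node :metaclass)

(defmethod emit-node :class [node]
  ;; Start with the tag and attributes, then append the rendered children.
  (into [:ownedMember {:name (:name node) :xmi:type "uml:Class"}]
        (map emit-node (:elements node))))

(defmethod emit-node :operation [node]
  [:ownedOperation {:name (:name node)}])

;; Fallback so that unknown metaclasses are visible rather than fatal.
(defmethod emit-node :default [node]
  [:unsupported {:metaclass (str (:metaclass node))}])

(emit-node {:metaclass :class :name "C1"
            :elements #{{:metaclass :operation :name "op1"}}})
```

Supporting a new metaclass is then a matter of adding one more defmethod, leaving the existing ones untouched.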

Because the AST is just Lisp, rather than (Java + some framework + some design patterns), constraint, transformation, human readable and action languages are also all just Lisp.

So perhaps we have been missing a trick. Choosing the right tool for the job makes all the difference.

© Clear View Training 2012