For those who don’t know, Jank is a Clojure implementation that targets LLVM (Low-Level Virtual Machine) instead of the JVM. That means Jank compiles to native code and interops directly with C++.
Jank already has ways to call C++, but I wanted to do the opposite: call Jank code from C. The reason might not be obvious, so here is why: writing libraries.
Not everything needs to be a “stand-alone executable”. One example is libraries for Node, Ruby, or Python. These languages are amazing in the levels of abstraction they support, and it’s easy to interact directly with the code and the runtime (in Node, using DevTools; in Ruby, using something like pry or Lazuli, my own plug-in). They are also quite slow, and in some cases we might need to call some native API that these languages don’t support. So what now? The canonical way is to write an extension in C or C++, which means we now have to manually manipulate memory and deal with safety issues (and before people say it’s “not that hard”: it is. A large share of CVEs come from manual memory manipulation in C, where every cast, every printf, every strcpy can lead to arbitrary code execution (ACE) and/or privilege escalation). Extensions are also not interactive, so if you’re trying to quickly hack a solution together, you need to write the code, compile it, make a shared library, load the library from the Ruby/Node/Python code, see if it does what you want, and repeat.
It’s tedious. Maybe with Jank we can speed up this process?
First, a disclaimer: Jank currently doesn’t seem to officially support what I want to do. Its creator seems to want to support this use case later, but right now it’s just a happy coincidence that this works. So let’s start with some base code:
(ns jank-test)

(defn some-code []
  (println "HELLO?"))
Save that to jank_test.jank and let’s compile it with Jank, but instead of making an executable, let’s instruct it to make a library with jank --module-path . compile-module jank-test.
This will generate some build files; in my case, the one that matters is target/x86_64-unknown-linux-gnu-6edc6e02e1bf8d875f77f87b5820996901c1894b142485e01a7785f173afb8df/jank_test.o. You might notice that this is not a shared library. As I said earlier, Jank doesn’t really support what I want to do right now, but it will in the future. For now, we can either create a shared library from this .o file, or create a final binary by linking it together with our own code. Let’s go with the second option, because it’s easier. Create a C++ file containing:
// (1)
extern "C" {
  void jank_load_jank_test();
}

#include <jank/c_api.h>

// (2)
using jank_object_ref = void*;
using jank_bool = char;
using jank_usize = unsigned long long;
extern "C" jank_object_ref jank_load_clojure_core_native();
extern "C" jank_object_ref jank_load_clojure_core();
extern "C" jank_object_ref jank_var_intern_c(char const *, char const *);
extern "C" jank_object_ref jank_deref(jank_object_ref);

int main(int argc, const char** argv)
{
  // (4)
  auto const fn{ [](int const argc, char const **argv) {
    // (5)
    jank_load_clojure_core_native();
    jank_load_clojure_core();
    // (6)
    jank_load_jank_test();
    auto const the_function(jank_var_intern_c("jank-test", "some-code"));
    jank_call0(jank_deref(the_function));
    return 0;
  } };

  // (3)
  jank_init(argc, argv, true, fn);
  return 0;
}
Lots of things here, so let’s go one by one. In (1), we declare an “external” reference. When Jank compiles a module, it generates a jank_load_<namespace> function that does what its name says: load the namespace. Unfortunately, it won’t load Clojure core’s namespace, nor will it load any dependencies (I told you this isn’t officially supported yet! You have been warned!). The “external” symbol is resolved at link time, and right now it lives only in the .o intermediate file. In (2) we define some types that we’ll need in further “extern” declarations. These refer, again, to the Jank library that will be linked with our code.
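To make the naming pattern concrete, here is a minimal sketch. The helper loader_symbol_for is hypothetical and not part of Jank. It assumes that only '-' and '.' get replaced by '_', which is inferred from the two names seen in this post (jank-test and clojure.core); Jank’s real name munging may well handle more characters.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, NOT part of Jank: guesses the loader symbol that
// would be generated for a namespace.
// Assumption: only '-' and '.' are replaced by '_'.
std::string loader_symbol_for(std::string ns)
{
  for(char &c : ns)
  {
    if(c == '-' || c == '.')
    {
      c = '_';
    }
  }
  return "jank_load_" + ns;
}

// Checks the helper against the two symbol names used in this post.
void check_loader_names()
{
  assert(loader_symbol_for("jank-test") == "jank_load_jank_test");
  assert(loader_symbol_for("clojure.core") == "jank_load_clojure_core");
}
```

If the linker complains about an undefined jank_load_... symbol, listing the symbols in the .o file (with nm, for example) is a more reliable way to find the real name than guessing.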
Now, in (3) (which is the second-to-last line of actual code) we init the Jank runtime. This bootstraps the “Clojure”-ish classes defined in Jank, and we need to pass it a fn argument, which is defined earlier in (4). Without this step, you will get a segfault when trying to run Jank code, so it is absolutely necessary.
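If the “pass a function to the init call” pattern looks odd, here is a minimal sketch of the idea with no Jank involved. fake_runtime_init is a hypothetical stand-in that I made up for illustration; it is not jank_init and does none of the real work. What it shows is that a lambda without captures converts implicitly to a plain function pointer, which is why the lambda in (4) can be handed to a C API.

```cpp
#include <cassert>

// Hypothetical stand-in for a runtime entry point (NOT Jank's jank_init).
// A real runtime would set itself up before calling fn and clean up after.
int fake_runtime_init(int argc, char const **argv, int (*fn)(int, char const **))
{
  int const result = fn(argc, argv);
  return result;
}

int run_inside_runtime(int argc, char const **argv)
{
  // No captures, so this lambda converts to int (*)(int, char const **).
  // A lambda with captures (for example [&]) would not compile here.
  auto const body{ [](int const count, char const **) -> int {
    // Everything that needs the runtime goes in here.
    return count;
  } };
  return fake_runtime_init(argc, argv, body);
}

// Small self-check: the callback receives the arguments it was given.
void check_callback()
{
  char const *args[] = { "program", "extra" };
  assert(run_inside_runtime(2, args) == 2);
}
```

One consequence of this design is that anything the callback needs has to arrive through its arguments or through globals, since it can’t capture local variables.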
In (5) we load “Native core” and “Clojure core”, meaning we start the core libraries that are built into Jank (native), which are then used to implement the clojure.core namespace in Clojure code; in (6), we also load our own namespace, the one we defined in our .jank file. Finally, after loading this namespace, we’re ready to call our function. We first create a “jank var” using jank_var_intern_c, which essentially resolves to #'jank-test/some-code, and then we deref it to get back the function. We use jank_call0 to call a function with arity 0, and we can use jank_call1 or jank_call2 if the function has arity 1 or 2, for example.
Finally, to compile your final binary:
clang++ \
  -L/usr/local/lib/jank/0.1/lib/ \
  -I/usr/local/lib/jank/0.1/include \
  test.cpp \
  target/x86_64-unknown-linux-gnu-6edc6e02e1bf8d875f77f87b5820996901c1894b142485e01a7785f173afb8df/jank_test.o \
  -lclang-cpp \
  -lLLVM \
  -lz \
  -lzip \
  -lcrypto \
  -l jank-standalone \
  -o program
So, test.cpp is the code we just created, and target/x86..../jank_test.o is the intermediate file that Jank compiled. The -L and -I flags point to the directories where Jank was installed, -o program instructs the compiler to write the output to program, and the rest are just libraries that we need to link against to generate the final binary. And that is it: running that binary will print HELLO? on the screen!
But of course, Jank can do that by itself, so…
But Why?
Suppose we’re working in some language, for example Ruby, and we want to optimize some code or integrate with some native library. The canonical way to do that is to write the library in C or C++, or sometimes Rust. Now, how would we create a class (let’s say Jank) in C++ so that it’s usable from Ruby? It’s quite simple, in fact:
#include <ruby.h>

extern "C" void Init_jank_impl() {
  rb_define_class("Jank2", rb_cObject);
}
That’s literally it. Now, suppose we want to move this to Jank, so that the class is defined in Jank code. How could we do that? It’s also very simple: we use the same technique from this post, but instead of defining a main function, we keep the extern... declarations and move everything that was in main into Init_jank_impl. Then, on the Jank side, we add:
(ns jank-impl)

(defn init-extension []
  (cpp/rb_define_class "Jank" cpp/rb_cObject))
That’s it. Can we create Ruby methods and do more with this? Hopefully! But not right now: while trying this approach I found some bugs in Jank, so until they get fixed (which, judging by how fast the language is evolving, I suspect will be very soon) we can’t.
But this might open up some very interesting possibilities, which I expect to expand on in a future post!