Creating a binary from Julia code
This section is for people who want to understand PackageCompiler.jl under the hood. It is not required reading to use the package.
This section shows how to build an executable on top of the custom sysimage so that the app can be run without explicitly starting a Julia session.
Interacting with Julia through libjulia
The way to interact with Julia without using the Julia executable itself is to call into the Julia runtime library (libjulia) from a C program. A fairly detailed set of docs on how this is done can be found in the embedding chapter of the Julia manual, and reading it before continuing is recommended. Since this section is meant to highlight the internals of PackageCompiler, we will not use the conveniences shown in that chapter (e.g. the julia-config.jl script), but it is good to know they exist.
A rough outline of the steps we will take to create an executable is:
- Create our Julia app with a Base.@ccallable entry point, which means the Julia function can be called directly from C.
- Create a custom sysimage to reduce latency (this is pretty much just doing part 1) and to hold the C-callable function from the first step.
- Write an embedding wrapper in C that loads our custom sysimage, does some initialization and calls the entry point in the script.
A toy application
To have something concrete to work with, we will create a very simple application. Keeping with the spirit of CSV parsing, we will create a small app that parses a list of CSV files given as arguments to the app and prints the size of the parsed result. The code for the app (MyApp.jl) is shown below:
module MyApp

using CSV

Base.@ccallable function julia_main()::Cint
    try
        real_main()
    catch
        Base.invokelatest(Base.display_error, Base.catch_stack())
        return 1
    end
    return 0
end

function real_main()
    for file in ARGS
        if !isfile(file)
            error("could not find file $file")
        end
        df = CSV.read(file)
        println(file, ": ", size(df, 1), "x", size(df, 2))
    end
end

if abspath(PROGRAM_FILE) == @__FILE__
    real_main()
end

end # module
The function julia_main has been annotated with Base.@ccallable, which means that a function with the unmangled name will appear in the sysimage. This function is just a small wrapper that calls out to real_main, which does the actual work. All the code that is executed is put inside a try-catch block, since an uncaught error would otherwise surface in the C code, where the backtrace is not very good.
To facilitate testing, we check whether the file was executed directly and, in that case, run the main function. We can test (and time) the script on the sample CSV file from the first tutorial:
❯ time julia MyApp.jl FL_insurance_sample.csv
FL_insurance_sample.csv: 36634x18
julia MyApp.jl FL_insurance_sample.csv 12.51s user 0.38s system 104% cpu 12.385 total
Create the sysimage
As in the previous tutorial, we do a "sample run" of our app to record what functions end up getting compiled. Here, we simply run the app on the sample CSV file since that should give good "coverage":
julia --startup-file=no --trace-compile=app_precompile.jl MyApp.jl "FL_insurance_sample.csv"
The custom_sysimage.jl script looks similar to before, with the exception that we added an include of the app file inside the anonymous module in which the precompilation statements are evaluated:
Base.init_depot_path()
Base.init_load_path()

@eval Module() begin
    Base.include(@__MODULE__, "MyApp.jl")
    for (pkgid, mod) in Base.loaded_modules
        if !(pkgid.name in ("Main", "Core", "Base"))
            eval(@__MODULE__, :(const $(Symbol(mod)) = $mod))
        end
    end
    for statement in readlines("app_precompile.jl")
        try
            Base.include_string(@__MODULE__, statement)
        catch
            # See julia issue #28808
            Core.println("failed to compile statement: ", statement)
        end
    end
end # module

empty!(LOAD_PATH)
empty!(DEPOT_PATH)
The sysimage is then created as before:
❯ julia --startup-file=no -J"/home/kc/julia/lib/julia/sys.so" --output-o sys.o custom_sysimage.jl
❯ gcc -shared -o sys.so -fPIC -Wl,--whole-archive sys.o -Wl,--no-whole-archive -L"/home/kc/julia/lib" -ljulia
Windows-specific flags
For Windows we need to tell the linker to export all symbols via the flag -Wl,--export-all-symbols. Otherwise, the linker will fail to find julia_main when we build the executable.
Creating the executable
Embedding code
The embedding script is the "driver" of the app. It initializes the Julia runtime, does some other initialization, calls into our julia_main and then does some cleanup when it returns. We can borrow a lot for this embedding script from the embedding manual. There are, however, some things we need to set up "manually" that Julia usually does by itself when starting. This includes assigning the PROGRAM_FILE variable as well as updating Base.ARGS to contain the correct values. The script MyApp.c ends up looking like:
// Standard headers
#include <string.h>
#include <stdint.h>

// Julia headers (for initialization and gc commands)
#include "uv.h"
#include "julia.h"

JULIA_DEFINE_FAST_TLS()

// Forward declare C prototype of the C entry point in our application
int julia_main();

int main(int argc, char *argv[])
{
    uv_setup_args(argc, argv);

    // JULIAC_PROGRAM_LIBNAME defined on command-line for compilation
    jl_options.image_file = JULIAC_PROGRAM_LIBNAME;
    julia_init(JL_IMAGE_JULIA_HOME);

    // Initialize Core.ARGS with the full argv.
    jl_set_ARGS(argc, argv);

    // Set PROGRAM_FILE to argv[0].
    jl_sym_t *var = jl_symbol("PROGRAM_FILE");
    jl_value_t *val = jl_cstr_to_string(argv[0]);
#if JULIA_VERSION_MAJOR == 1 && JULIA_VERSION_MINOR >= 10
    jl_binding_t *bp = jl_get_binding_wr(jl_base_module, var);
    jl_checked_assignment(bp, jl_base_module, var, val);
#elif JULIA_VERSION_MAJOR == 1 && JULIA_VERSION_MINOR >= 9
    jl_binding_t *bp = jl_get_binding_wr(jl_base_module, var, 1);
    jl_checked_assignment(bp, val);
#else
    jl_set_global(jl_base_module, var, val);
#endif

    // Set Base.ARGS to `String[ unsafe_string(argv[i]) for i = 1:argc ]`
    jl_array_t *ARGS = (jl_array_t*)jl_get_global(jl_base_module, jl_symbol("ARGS"));
    jl_array_grow_end(ARGS, argc - 1);
    for (int i = 1; i < argc; i++) {
        jl_value_t *s = (jl_value_t*)jl_cstr_to_string(argv[i]);
        jl_arrayset(ARGS, s, i - 1);
    }

    // call the work function, and get back a value
    int ret = julia_main();

    // Cleanup and gracefully exit
    jl_atexit_hook(ret);
    return ret;
}
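The argument handling in the driver can be looked at in isolation, without the Julia runtime. The following is a minimal sketch in plain C of the split the driver performs: argv[0] becomes PROGRAM_FILE and the remaining entries become the contents of Base.ARGS. The type split_argv_t and the function split_argv are hypothetical names introduced for illustration only; they are not part of the Julia C API or of PackageCompiler.

```c
#include <stddef.h>

/* Hypothetical helper type for illustration, not part of the Julia C API. */
typedef struct {
    const char *program_file; /* value the driver assigns to PROGRAM_FILE */
    char **args;              /* first of the entries destined for Base.ARGS */
    int nargs;                /* number of entries destined for Base.ARGS */
} split_argv_t;

/* Split argv the way the driver above does: argv[0] is the program name,
   argv[1] .. argv[argc - 1] are the user arguments. */
split_argv_t split_argv(int argc, char *argv[])
{
    split_argv_t out;
    out.program_file = argc > 0 ? argv[0] : NULL;
    out.nargs = argc > 1 ? argc - 1 : 0;
    out.args = out.nargs > 0 ? &argv[1] : NULL;
    return out;
}
```

Calling split_argv(argc, argv) at the top of main for the invocation ./MyApp FL_insurance_sample.csv would give a program_file of "./MyApp" and a single argument, which matches what real_main iterates over through ARGS.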
Building the executable
We now have all the pieces needed to build the executable: a sysimage and a driver script. It is compiled as:
❯ gcc -DJULIAC_PROGRAM_LIBNAME=\"sys.so\" -o MyApp MyApp.c sys.so -O2 -fPIE \
-I'/home/kc/julia/include/julia' \
-L'/home/kc/julia/lib' \
-ljulia \
-Wl,-rpath,'/home/kc/julia/lib:$ORIGIN'
where we have added an rpath entry into the executable so that the Julia library can be found at runtime as well as the sys.so library ($ORIGIN means to look in the same folder as the binary for shared libraries).
❯ time ./MyApp FL_insurance_sample.csv
FL_insurance_sample.csv: 36634x18
./MyApp FL_insurance_sample.csv 0.19s user 0.09s system 242% cpu 0.115 total
❯ ./MyApp non_existing.csv
ERROR: could not find file non_existing.csv
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] real_main() at /home/kc/MyApp/MyApp.jl:21
[3] julia_main() at /home/kc/MyApp/MyApp.jl:7
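To make the $ORIGIN entry in the rpath more concrete, here is a rough sketch in plain C of the path handling it implies: take the directory that contains the executable and look for the library there. The real resolution is done by the dynamic loader, not by application code, so origin_dir and origin_candidate are hypothetical helpers that only mimic the string manipulation, and the executable path used in the usage note is an illustrative assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the directory part of exe_path into out.
   Returns 0 on success, -1 if out is too small. */
int origin_dir(const char *exe_path, char *out, size_t n)
{
    const char *slash = strrchr(exe_path, '/');
    size_t len;
    if (slash == NULL) {
        /* No directory component: treat it as the current directory. */
        if (n < 2) return -1;
        out[0] = '.';
        out[1] = '\0';
        return 0;
    }
    /* Keep a lone leading "/" for paths such as "/MyApp". */
    len = (slash == exe_path) ? 1 : (size_t)(slash - exe_path);
    if (len + 1 > n) return -1;
    memcpy(out, exe_path, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: build "<origin>/<libname>", the location that an
   $ORIGIN rpath entry makes eligible for the library lookup. */
int origin_candidate(const char *exe_path, const char *libname,
                     char *out, size_t n)
{
    char dir[4096];
    int written;
    if (origin_dir(exe_path, dir, sizeof dir) != 0) return -1;
    written = snprintf(out, n, "%s%s%s", dir,
                       strcmp(dir, "/") == 0 ? "" : "/", libname);
    return (written < 0 || (size_t)written >= n) ? -1 : 0;
}
```

For an executable assumed to live at /home/kc/MyApp/MyApp, origin_candidate with the library name sys.so yields /home/kc/MyApp/sys.so, which is why the sysimage must sit next to the binary unless another rpath entry covers it.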
macOS considerations
On macOS, instead of $ORIGIN for the rpath, use @executable_path.
Windows considerations
On Windows, it is recommended to increase the size of the stack from the default 1 MB to 8 MB, which can be done by passing the -Wl,--stack,8388608 flag. Windows doesn't have (at least not in as simple a way as Linux and macOS) the concept of rpath. The go-to solution is to either add the Julia bin folder to the PATH environment variable or, alternatively, copy all the libraries in the Julia bin folder so they sit next to the executable.
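The value 8388608 in the stack flag is simply 8 MB expressed in bytes (8 × 1024 × 1024). As a small sketch, the arithmetic and the resulting flag can be written as follows; stack_bytes and stack_flag are hypothetical helper names used only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: convert a stack size in MiB to bytes. */
unsigned long stack_bytes(unsigned long mib)
{
    return mib * 1024UL * 1024UL;
}

/* Hypothetical helper: format the linker flag for a stack of `mib` MiB.
   Returns 0 on success, -1 if out is too small. */
int stack_flag(unsigned long mib, char *out, size_t n)
{
    int written = snprintf(out, n, "-Wl,--stack,%lu", stack_bytes(mib));
    return (written < 0 || (size_t)written >= n) ? -1 : 0;
}
```

Calling stack_flag(8, buf, sizeof buf) fills buf with the flag used above, and other sizes follow the same pattern.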