How to Use the Foreign Function API in Java 22 to Call C Libraries

207 points
1/20/1970
14 days ago
by pjmlp

Comments


thefaux

I am sort of surprised that there isn't a widely used tool that uses codegen to generate jni bindings sort of like what the jna does but at build time. You could go meta and bundle a builder in a jar that looks for the shared library in a particular place and shells out to build and install the native library if it is missing on the host computer. This would run once pretty similar I think to bundling native code in npm.

I have bundled shared libraries for five or six platforms in a java library that needs to make syscalls. It works but it is a pain if anything ever changes or a new platform needs to be brought up. Checking in binaries always feels icky but is necessary if not all targets can be built on a single machine.

The problem with the new api is that people upgrade java very slowly in most contexts. For an oss library developer, I see very little value add in this feature because I'm still stuck for all of my users who are using an older version of java. If I integrate the new ffi api, now I have to support both it and the jni api.

11 days ago

MaxBarraclough

> I am sort of surprised that there isn't a widely used tool that uses codegen to generate jni bindings sort of like what the jna does but at build time

There are several, including SWIG.

11 days ago

andoando

Which is still a pita to use unless maybe you really know what you're doing.

11 days ago

TillE

Binding generation is really difficult to approach as a general problem, which is why I've found that SWIG unfortunately doesn't help much in non-trivial cases.

All the good bindings I use are generated by custom systems (ie, usually some Python scripts) tailored for the specific way their library works.

10 days ago

lelanthran

There is SWIG, which does bindings to and from C for almost every language that exists.

11 days ago

PaulHoule

Back in 1998 I wrote a code generator to make JNI stubs for LAPACK. It’s the kind of programming that goes that way.

11 days ago

gudzpoz

There is a library called jnigen [1], mainly used by the libGDX framework [2]. But I don't see it used in many other projects though. Personally I use it to maintain a set of Lua C API bindings for some platforms [3] and it works sort of OK once you manage to somehow set up a workflow for building and testing the binaries.

> It works but it is a pain if anything ever changes or a new platform needs to be brought up. Checking in binaries always feels icky but is necessary if not all targets can be built on a single machine.

It is definitely a pain when you cannot test all changes on a single local machine. But I would argue that it is true whenever multiple platforms (or maybe even multiple glibc versions) are involved, regardless of what languages/libraries/tools you use.

[1] https://libgdx.com/wiki/utils/jnigen [2] https://github.com/libgdx/libgdx [3] https://github.com/gudzpoz/luajava

11 days ago

marginalia_nu

What I'm missing is a model for building/distributing those C libraries with a java application.

Every ffi example I've found seems to operate on the assumption that you want to invoke syscalls or libc, which (with the possible exception of things like madvise and aioring) Java already mostly has decent facilities to interact with even without native calls.

11 days ago

sedro

Native libraries are typically packaged inside a jar so that everything works over the existing build and dependency management systems.

For example, each of these jars named "native-$os-$arch.jar" contains a .dll/.so/.dylib: https://repo1.maven.org/maven2/com/aayushatharva/brotli4j/

JNA will extract the appropriate native library (using os.name and os.arch system properties), save the library to a temp file, then load it.
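For illustration, here is a minimal sketch of how the os.name and os.arch properties might be mapped to a resource path inside a jar. The class `NativePlatform`, the `/native/$os-$arch/` layout, and the set of recognized platforms are all hypothetical; this is not JNA's actual lookup code.

```java
import java.util.Locale;

/** Hypothetical helper: maps JVM system properties to a bundled library path. */
final class NativePlatform {
    private NativePlatform() {}

    /** Normalizes an os.name value; "darwin" contains "win", so macOS is checked first. */
    static String os(String osName) {
        String n = osName.toLowerCase(Locale.ROOT);
        if (n.contains("mac") || n.contains("darwin")) return "osx";
        if (n.contains("win")) return "windows";
        if (n.contains("linux")) return "linux";
        throw new IllegalArgumentException("unsupported os.name: " + osName);
    }

    /** Normalizes an os.arch value to one of two assumed architecture names. */
    static String arch(String osArch) {
        String a = osArch.toLowerCase(Locale.ROOT);
        if (a.equals("amd64") || a.equals("x86_64")) return "x86_64";
        if (a.equals("aarch64") || a.equals("arm64")) return "aarch64";
        throw new IllegalArgumentException("unsupported os.arch: " + osArch);
    }

    /** Applies the usual platform file naming to a base library name. */
    static String libraryFileName(String os, String baseName) {
        return switch (os) {
            case "windows" -> baseName + ".dll";
            case "osx" -> "lib" + baseName + ".dylib";
            default -> "lib" + baseName + ".so";
        };
    }

    /** Builds the (assumed) classpath location of the library for a platform. */
    static String resourcePath(String osName, String osArch, String baseName) {
        String os = os(osName);
        return "/native/" + os + "-" + arch(osArch) + "/" + libraryFileName(os, baseName);
    }
}
```

In real use the first two arguments would come from `System.getProperty("os.name")` and `System.getProperty("os.arch")`.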

11 days ago

throwaway2037

> JNA will extract the appropriate native library ..., save the library to a temp file, then load it.

JNA does this?

FYI: JNA = Java Native Access project: https://github.com/java-native-access/jna

11 days ago

okr

Examples of JARs, that transport such libraries: snappy, sqlite...

10 days ago

gwbas1c

> Every ffi example I've found seem to operate on the assumption that you want to invoke syscalls or libc ... Java already mostly has decent facilities to interact with even without native calls.

Because you would use ffi to interact with libraries that don't have Java wrappers yet: IE, you're writing the wrapper.

Using syscalls or libc is a way to write an example against a known library that you're probably familiar with.

11 days ago

pron

The recommended distribution model for Java applications is a jlinked runtime image [1], which supports including native libraries in the image.

[1]: Technically, this is the only distribution model because all Java runtimes as of JDK 9 are created with jlink, including the runtime included in the JDK (which many people use as-is), but I mean a custom runtime packaged with the application.

11 days ago

maksut

Is that still true when distributing libraries?

11 days ago

brabel

Absolutely not. jlink is used to distribute applications (it includes your code, the Java libs you use, i.e. their jars, and the trimmed-down JVM with the modules you're using so that your distribution is not so big - typically around 30MB).

Java libraries are still obtained from Maven repositories via Maven/Gradle/Ant/Bazel/etc.

11 days ago

pron

Only if you distribute libraries as jmod files, which few libraries do. In that case, jlink would automatically extract the native libraries and place them in the appropriate location.

11 days ago

aardvark179

So, other people have already answered this, but this does seem to be a gap where many developers lack some piece of knowledge to chain the whole solution together. You normally package this sort of thing by putting the native library in a jar, extracting it to a tmp file that will be deleted on exit, and opening that dynamic library.

I’ve met many perfectly reasonable developers who do know all those steps can be done but can’t put them all together - maybe because it just hasn’t clicked that you can store a library in a jar. It feels like something tutorials should cover, but I think falls into the, “surely everyone can work it out?” category.
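The steps described above (store the library in the jar, copy it to a temp file, load it by absolute path) can be sketched roughly as follows. The class `NativeLoader`, the `native-` prefix, and any resource path passed in are hypothetical names for illustration, and error handling is kept to a minimum.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: loads a native library that is bundled as a classpath resource. */
final class NativeLoader {
    private NativeLoader() {}

    /** Copies the library bytes to a fresh temp file and returns its absolute path. */
    static Path extract(InputStream in, String prefix, String suffix) throws IOException {
        Path tmp = Files.createTempFile(prefix, suffix);
        // Best-effort cleanup at JVM exit; on Windows this can fail while the DLL is loaded.
        tmp.toFile().deleteOnExit();
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp.toAbsolutePath();
    }

    /** Extracts the resource next to the anchor class and loads it via System.load. */
    static void loadFromClasspath(Class<?> anchor, String resourcePath, String suffix)
            throws IOException {
        try (InputStream in = anchor.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("missing resource: " + resourcePath);
            }
            // System.load needs an absolute path to a regular file, hence the temp copy.
            System.load(extract(in, "native-", suffix).toString());
        }
    }
}
```

A caller would do something like `NativeLoader.loadFromClasspath(MyLib.class, "/native/linux-x86_64/libfoo.so", ".so")`, where `MyLib` and the path are placeholders.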

11 days ago

xxs

> extract it to a tmp file that will be deleted on exit,

Actually, you can delete it immediately (after load) on anything that's not Windows. You can try on Windows too, but the delete is likely to return false.

deleteOnExit just stores the path to delete and uses a shutdownHook to actually call delete. Nothing really special about it

11 days ago

sedro

You would also need to learn about Maven profiles and activation. And for other build tools, you'll be delighted to know they have partial support.

11 days ago

chii

> extracting it to a tmp file

i wonder if there's a way to do this entirely in memory? Because some deployment scenarios might not have disk space at all.

11 days ago

rwmj

Technically memfd_create will let you create a file descriptor backed by a memory region. However in Linux I don't believe there's a way to dlopen that. (Maybe dlopen /dev/fd/... might work?) In FreeBSD there's a fdlopen library function.

Edit: glibc proposal which was never accepted: https://sourceware.org/bugzilla/show_bug.cgi?id=11767

10 days ago

MobiusHorizons

/tmp is often a RAM disk in such cases

11 days ago

xxs

not with java - it needs a path. Of course if you have a ram disk (e.g. /tmp) it'd do the job.

11 days ago

mike_hearn

If your app is open source, or you're willing to buy a commercial tool, then you could try Conveyor from my company [1]. It will:

- Find all the shared libraries in your JARs or configured app inputs (files in your build/source tree)

- Sniff them to figure out what OS and CPU arch they are for

- Bundle them into the right package for each platform that it makes, in the right place to be found by System.loadLibrary()

- Sign them if necessary

- Delete them from the JARs now they are extracted. Optionally extract them from library JARs, sign them and then put them back if your library refuses to load the shared library from disk instead of unpacking it (most libs don't need this)

- JLink a bundled JVM for your app for each platform you target, using jdeps to figure out the right set of modules, and combine that with your shared libs.

When building Debian/Ubuntu packages it will also:

- Read the .so library dependencies, look up the packages that contain those other shared libraries and add package dependencies on those packages, so "apt install" will do the right thing.

So that makes it a lot easier to distribute Java apps that use native code.

[1] https://www.hydraulic.dev/

10 days ago

pjmlp

You do it the standard way, package them inside the jar file.

11 days ago

marginalia_nu

Oh, does this actually work?

I was under the assumption that it was dynamically linking the library with the OS dynamic linker, which in no OS I'm aware of is capable of loading libraries inside of zip files.

Not sure where I got that notion. Maybe I was overthinking this.

11 days ago

zten

Yes. Check out a library like zstd-jni. You'll find native libraries inside it. It'll load from the classpath first, and then ask the OS linker to find it.

11 days ago

maksut

I'd like to learn how they do it. Because last time I looked at this, the suggested solution was to copy the binaries from the classpath (e.g. the jar) into a temporary folder and then load them from there. It feels icky :)

11 days ago

zten

Yep, you're right, they do exactly that. Apologies for the confusion.

Decompiled class file:

    try {
        var4 = File.createTempFile("libzstd-jni-1.5.0-4", "." + libExtension(), var0);
        var4.deleteOnExit();

11 days ago

sedro

This wouldn't work on Windows, because you can't delete a DLL while it's in use

11 days ago

electrum

You might be able to use FILE_FLAG_DELETE_ON_CLOSE, but this would likely require calling the Windows API functions directly.

11 days ago

BenjiWiebe

Couldn't you: extract DLL, load DLL, unload DLL, delete DLL?

Though in the example given, I do see your point now. You'd have to make sure the DLL was unloaded before the delete-on-exit happened.

11 days ago

sedro

According to JNA it's not safe to unload the DLL:

https://github.com/java-native-access/jna/blob/40f0a1249b5ad...

  Do NOT force the class loader to unload the native library, since
  that introduces issues with cleaning up any extant JNA bits
  (e.g. Memory) which may still need use of the library before shutdown.

Following the blame back to 2011, they did unload DLLs before: https://github.com/java-native-access/jna/commit/71de662675b...

  Remove any automatically unpacked native library.  Forcing the class
  loader to unload it first is only required on Windows, since the
  temporary native library is still "in use" and can't be deleted until
  the native library is removed from its class loader.  Any deferred
  execution we might install at this point would prevent the Native
  class and its class loader from being GC'd, so we instead force 
  the native library unload just a little bit prematurely.

Users reported occasional access violation errors during shutdown.

11 days ago

tadfisher

You can install a shutdown hook to do cleanup like this.

    Runtime.getRuntime().addShutdownHook(...)

11 days ago

sedro

That's how java.io.File#deleteOnExit works under the hood. The DLL is still loaded at that point and can't be deleted.

11 days ago

tadfisher

Ah, looking through the docs [1]; you have to use your own ClassLoader (so it can be garbage-collected), and statically-link with a JNI library which is unloaded when the ClassLoader is garbage-collected.

1: https://docs.oracle.com/en/java/javase/22/docs/specs/jni/inv...

10 days ago

zten

Hmm, interesting. They do have DLLs in the JAR...

10 days ago

renewiltord

EDIT: Disregard. I am wrong. Original below.

You can just load as a resource. We do this internally since much of network stack is C. But we use JNI because code is older than Java 22.

11 days ago

maksut

You made me search it again. And still I don't see how that's possible. `Runtime.load` requires a regular file with an absolute path[0].

Stackoverflow is full of "copy it into a temp file" solutions. ChatGPT keeps saying "sorry" but still insists on copying it into a temp file :)

[0] - https://docs.oracle.com/en%2Fjava%2Fjavase%2F22%2Fdocs%2Fapi...

11 days ago

renewiltord

Embarrassing of me to give you wrong answer. I went and checked my old code and:

     new FileOutputStream(tmpFile)

Apologies.

11 days ago

marginalia_nu

Sounds promising.

I have some extremely unwieldy off-heap operations currently implemented in Java (like quicksort for 128 bit records) that would be very nice to offload as FFI calls to the corresponding single-line C++ function.

11 days ago

neonsunset

Why not give C# a try instead? It has everything you ask for and then some.

11 days ago

coldtea

Because "some inconvenience/unmet requirement" from a language is not an invitation to "throw out the whole platform and your existing code and tooling, and learn/adopt/use an entirely different, single-vendor platform".

Except if we're talking about some college student or hobbyist picking their first language and exploring the language space...

11 days ago

imtringued

He would still have to call out to the C++ function.

10 days ago

neonsunset

Assuming it is "sort for 128bit records", that's something C# does really well - writing optimized code with structs / Vector128<T> / pointer arithmetic when really needed without going through FFI and having to maintain separate build step and project parts for a different platform.

But even if it was needed, such records can be commonly represented by the same structs both at C#'s and C++'s sides without overhead.

An array of such could be passed as is as a pointer, or vice versa - a buffer of structs allocated in C/C++ can be wrapped in a Span<Record128> and transparently interact with the rest of standard library without having to touch unsafe (aside from eventually freeing it, should that be necessary).

10 days ago

neonsunset

Wow, you all are sure mad enough to go out of your way and downvote my comments elsewhere.

Stay in the swamp :)

11 days ago

brabel

I remember using Sqlite Java and not having to install sqlite on the image. Then I looked inside the Sqlite-java's jar and they just packed the sqlite binaries for the different OSs in the jar!!

11 days ago

dehrmann

Not sure if this is still the case, but one of the Java Sqlite drivers used something called NestedVM to run Sqlite in the JVM when a native library wasn't available. It worked by cross-compiling the code to MIPS, then transpiling the MIPS assembly to Java byte code. I can't remember if it bridged system calls or libc calls to Java for things like file IO.

11 days ago

DannyB2

I once (2016 ish) used a serial-port library for Java. Needed to be cross platform desktop app for Linux, Windows and Mac (in that order, all on x86/64). And it was. I have forgotten the name of the library project I included, but it included DLL binaries for the platforms we were targeting.

11 days ago

KptMarchewa

That's a common solution. I do the same.

11 days ago

saagarjha

Android knows how to do this, actually.

11 days ago

fire_lake

Is there a solution when the binaries are 500mb+ per platform?

11 days ago

pjmlp

People seem pretty happy when Go and Rust do the same with static linking, advocating how great it happens to be.

11 days ago

yw3410

You can't be as aggressive at removing functions in Java as in Rust though, since it's dynamic dispatch (e.g., if you use toString once in your code, you need to keep all implementations of toString which are reachable even if users don't use reflection).

11 days ago

neonsunset

.NET's trimmer/linker deals with this quite well, only referenced or otherwise observable .ToString() implementations are rooted.

Without it 1.6-2MiB-sized AOT binaries would not have been possible (most space is occupied by standard library/runtime bits and GC)

11 days ago

pjmlp

Except that is what jlink, and GraalVM/OpenJ9 (among other AOT toolchains) do in practice.

10 days ago

fire_lake

In Java libraries are shared precompiled so the package manager either needs to be platform aware or distribute fat bundles.

11 days ago

pjmlp

Only for those that are yet to learn how to use jlink.

10 days ago

cesarb

Static linking in Go and Rust includes compiled code only for the target platform. It does not include compiled code for every possible architecture, including 32-bit MacOS and Solaris on PowerPC.

11 days ago

SJC_Hacker

Solaris... now that's a name I have not heard in a long time. A long time.

11 days ago

pjmlp

Hence why you end up with 300MB x platforms, and tricks like upx.

10 days ago

mike_hearn

Libraries like JCEF have support tools to download the libraries either at runtime or during the build, to offload from Maven Central.

11 days ago

ruslan_talpa

Put them in a jar?

11 days ago

stevefan1999

Compared to .NET's P/Invoke this is still way too convoluted. Of course Java has its own domain problem such as treating everything as a reference (and thus pointer, there is a reason Java has NullPointerException rather than NullReferenceException) and the lack of stack value types (everything lives on heap unless escape analysis allows some data to stay on stack, but it is uncontrollable anyway) makes translation of Plain-Old-Data (POD) types in Java very difficult, which is mostly a no-op with C#. That's why JNI exists as a mediator between native code and Java VM.

In C# I can just do something like this conceptual code:

```
// FILE *fopen(const char *filename, const char *mode)
[DllImport("libc")] public static extern nint fopen([MarshalAs(UnmanagedType.LPStr)] string filename, [MarshalAs(UnmanagedType.LPStr)] string mode);

// char *fgets(char *str, int n, FILE *stream)
// str is an output buffer, so it is passed as a byte array rather than an immutable string
[DllImport("libc")] public static extern nint fgets(byte[] str, int n, nint stream);

// int fclose(FILE *stream)
[DllImport("libc")] public static extern int fclose(nint stream);
```

So much less code, and so much more precise than any of the Java JNI and FFI stuff.

11 days ago

the-alchemist

Java's FFI is currently very low-level. As the article points out, you don't actually have to write this by hand: the jextract tool will generate the bindings for you from header files.

I'm sure someone will come along and write annotations to do exactly as you describe there. The Java language folks tend to be very conservative about putting stuff in the official API, cuz they know it'll have to stay there for 30+ years. They prefer to let the community write something like annotations over low-level APIs.

Anyway, the GraalVM folks don't have quite the same limitations as Java, so they have annotations already (https://yyhh.org/blog/2021/02/writing-c-code-in-javaclojure-...):

    @CStruct("MDB_val")
    public interface MDB_val extends PointerBase {

        @CField("mv_size")
        long get_mv_size();

        @CField("mv_size")
        void set_mv_size(long value);

        @CField("mv_data")
        VoidPointer get_mv_data();

        @CField("mv_data")
        void set_mv_data(VoidPointer value);
    }

10 days ago

pjmlp

I wasn't aware of it, great! One more point to the GraalVM folks.

10 days ago

neonsunset

Can be even simpler now (you can declare it as a local function in a method, so this works when copied to Program.cs as is):

    var text = "Hello, World!"u8;
    write(1, text, text.Length);

    [DllImport("libc")]
    static extern nint write(nint fd, ReadOnlySpan<byte> buf, nint count);

(note: it's recommended to use [LibraryImport] instead for p/invoke declarations that require marshalling as it does not require JIT/runtime marshalling but just generates (better) p/invoke stub at build time)

11 days ago

pjmlp

Yep, that is my main complaint, and why I will rather reach to JNI instead.

10 days ago

xyproto

Does this mean that one can use SDL2 together with Java without bending over backwards?

11 days ago

maksut

I have played with raylib bindings for clojure by using the new foreign function api. It was a lot of fun. SDL might be a better fit because it prefers pass by reference arguments [0].

[0] https://gist.github.com/raysan5/17392498d40e2cb281f5d09c0a4b...

11 days ago

neonsunset

It seems it will make it somewhat easier.

But if you want to use SDL2 from something higher-level, you will be much better served by C# which will give you minimal FFI cost and most data structures you want to express in C as-is.

11 days ago

maksut

I don't know much about C#. It certainly looks more popular in gamedev circles.

When I played with this new Java API, I wasn't worried about the FFI cost. It seemed fast enough to me. My toy application was performing at about 0.77x of the pure C equivalent. I think Java's memory model and heavy heap use might hurt more. Hopefully Java will catch up when it gets value objects with Project Valhalla. Next decade or so :)

11 days ago

neonsunset

Genuine curiosity - what would be your motivation to use Java over C# here aside from familiarity (which is perfectly understandable)? The latter takes heavy focus on making sure to provide features like structs and pointers with little to no friction, you can even AOT compile it and statically link SDL2 into a single executable.

In improbable case you may want to try it out, then all it needs is

- SDK from https://dot.net/download (or package manager of your choice if you are on Linux e.g. `sudo apt-get install dotnet-sdk-8.0`, !do not! use Homebrew if you are on macOS however, use .pkg installer)

- C# extension for VS Code (DevKit is not needed)

- SDL2 abstraction: https://github.com/dotnet/Silk.NET (there are all sorts of alternate bindings depending on your preferences)

11 days ago

stoperaticless

Not the op, but at some point I did choose between the two paths/jobs assuming I will get more proficient in only one of them each year (which is true, I stayed junior in C#).

Why I chose Java boils down to two reasons:

- runs on linux (I know there is some version of c# that eventually opened up, but I kind of expect it to have lot of conditions for being cross platform, I assume that standard c# code is not crossplatform due to some reason (e.g. Com usage might be standard way of doing stuff), which would make finding crossplatform answers tedious)

- whole ecosystem is more open source and has more involved parties (which I interpreted as a bit less controlled by the corporate overlord, so if the corporate overlord went rogue, there's a greater chance that the language would survive somehow)

Never needed to call into C though..

11 days ago

5e92cb50239222b

Despite what some fanatics may claim, operating systems other than Windows are still second class citizens (saying this after five years of doing .NET development almost exclusively on Linux), especially for dev, and operating systems other than the big three are not supported at all. So no BSDs (even FreeBSD) or Solaris if you ever need it.

Since the open .NET is pretty young, and they still have trouble with community perception due to their past actions, finding high quality FOSS libraries may pose a problem depending on what you're doing. Pretty much everything from MS is open and high quality, but they don't provide everything under the sun.

And with Java you always have alternative runtimes in case this Oracle deal goes sideways for any reason.

So you're all good, don't worry about it.

11 days ago

neonsunset

FreeBSD: `pkg install lang/dotnet` (from https://www.freshports.org/lang/dotnet)

GObject (GTK4 and similar): https://github.com/gircore/gir.core (significantly better and faster than Java alternatives, this is just one example among many)

Young: first OSS version was released 8 years ago

Solaris: might as well say "it runs COBOL but not .NET"

It's funny that everyone missed the initial context of the question and jumped onto parroting the same arguments as years ago, without actually addressing the matter at hand or saying anything of substance. Unsurprising show of ignorance by Java community. Please stay this way - will help the industry move on faster.

The premise is always the same - if something is missing in {technology I don't like}, it's a deal-breaker, and when it's not or was always there - it never mattered, or is harmful actually, that is, until {technology I like} gets it as well.

11 days ago

pooya72

Interesting, in what ways is it a second class citizen? I tried googling it, but didn't find much.

11 days ago

neonsunset

Neither point is true today FWIW.

Neither point was ever true in the last ~10 years when it comes to gamedev (or where you want to use SDL) where Java was and continues to be a much weaker choice.

11 days ago

kaba0

Java’s ecosystem is just vastly bigger. In many categories, Java has multiple open-source offerings vs .NET’s single, proprietary one that is often just a bad copy of one of the former libraries.

11 days ago

spullara

Java now has <1ms max pause time garbage collectors with TB heaps. If you are writing a game GC matters a lot.

11 days ago

neonsunset

Interesting! What games did you write code for?

11 days ago

maksut

It was a learning exercise. Just playing around with clojure, raylib and this new api. I know all these can also be done with C# with some pros & cons.

I wasn't advocating Java for gamedev. Just pointing out that this new API is a nice addition. And I am glad that the JVM ecosystem is improving.

To be fair, if I was starting a game project I wouldn't stay in Java/C# level. Depending on the project, something like C, C++, zig might be more practical. Ironically I believe they would be easier for iterating ideas and deploy into different platforms (mobile, wasm etc.).

10 days ago

neonsunset

A little bit sad* but understandable, thank you.

*C/C++ tooling and verbosity pains, what PL dev progress is for? C# feels more modern than some of the "modern" alternatives but eh.

10 days ago

lazide

Not the original poster, but most folks have little choice what ecosystem they’re using.

And once you have enough momentum, switching isn’t usually worth it.

(As someone who has done Perl, C, Java, C#, Kotlin, JS, and Python professionally - god help me. Maybe a million lines of code all in now?)

11 days ago

neonsunset

Fair enough

11 days ago

p0w3n3d

You are (or were) right. Java has (had) awful performance for foreign API calls, and I wonder whether this was fixed in this release, because from what I heard, fixing it was the main reason for the upcoming functionality.

11 days ago

neonsunset

It could bring Java closer in FFI overhead but not necessarily match. There are still missing features like structs, C pointers (though in C# they are superseded quite a bit by byrefs aka `ref T` syntax, e.g. used by Span<T>), stack allocated buffers, etc.

C# also has function pointers (managed/unmanaged) and C exports with NativeAOT.

11 days ago

marginalia_nu

Oh man that's a cool idea.

Might just build a SDL2-wrapper for ffi just as an FFI and FMI-exercise.

11 days ago

alex_suzuki

Wonder if this will make JNA (Java Native Access) redundant at some point: https://github.com/java-native-access/jna

Very useful, especially the prebundled platform bindings.

11 days ago

iso8859-1

Calling C is easy. But how do you call C++? Shiboken has a language that lets you express ownership properties on C++ data structures/methods/functions. It's tailored to generating Python FFI bindings though. It would be so nice if there were a cross-platform language to do this.

11 days ago

qweqwe14

The answer is basically you don't. It's impossible to make a sane, stable FFI for a language unless you put it behind a C ABI, which is relatively basic, but this is exactly why it's most suitable for FFI: implementing support for calling C functions is way more trivial than figuring out how to call the latest C++/Rust/etc monstrosity.

11 days ago

zozbot234

> The answer is basically you don't. It's impossible to make a sane, stable FFI for a language unless you put it behind a C ABI

The Swift folks have put a lot of effort into attaining a stable ABI that's native to their language. They can achieve that because Swift is the officially endorsed language for development on Mac OS and iOS, so it (together with the platform itself) can set a standard that other languages will have to live with.

In a way, software VM's like the JVM and CLR can also be said to define 'ABIs' of sorts within their runtime, that every language implementation on these runtimes will have to deal with.

11 days ago

Dwedit

There do exist ABIs that aren't the C ABI. But saying "use the C ABI" is far more portable than anything else.

I can also point to the GCC Inline Assembler as an excellent way to call arbitrary functions whether they implement the standard C procedure call standard or not. By providing the list of arguments and what register they correspond to, along with the clobber list, you know everything you need to know to call the function. So it's more suitable for "fastcall" type functions where you need the arguments to correspond to particular registers.

But of course, ASM isn't portable.

11 days ago

neonsunset

Swift ABI proves this to be wrong, but also showcases the complexity that goes with ABI of such kind.

11 days ago

secondcoming

You put your C++ behind a C API.

11 days ago

p0w3n3d

This is something new. Before it you had to create a native-compatible shared library that returns jString/jObject instead or use a proxy which did this for you (JNA). Let's see what happens next, maybe even shiboken

11 days ago

imtringued

I don't know why people don't know this, but you can just use GObject.

10 days ago

mike_hearn

There's javacpp which can do that.

11 days ago

creativeSlumber

Not directly related to the article, but is there any article that explains how memory management (stack/heap) works when using FFI in Java? Also, when a call is made through FFI to a C library, is there a separate Java and C call stack? I haven't found a good article yet on what happens under the hood.

11 days ago

w10-1

For the heap, JEP 454 is reasonably detailed: https://openjdk.org/jeps/454

It describes how to adopt memory from C and have C adopt memory you allocate, and gives control over how memory is allocated in an arena.

The arena has lifecycle boundaries, and allocations determine the memory space available. Java guarantees (only) that you can't use unallocated memory or memory outside the arena, and if you access via a (correct) value layout, you should be able to navigate structure correctly.
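As a rough sketch of the arena lifecycle described here, the following calls libc's `strlen` on a string that is copied into off-heap memory owned by a confined arena. It assumes Java 22, a platform where `size_t` is 64 bits wide, and that `strlen` is visible through the linker's default lookup; the class name `StrlenDemo` is made up for the example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

/** Hypothetical demo class: calls C's strlen through the FFM API. */
final class StrlenDemo {
    private StrlenDemo() {}

    static long strlen(String s) {
        Linker linker = Linker.nativeLinker();
        // size_t strlen(const char *s); size_t is modelled as JAVA_LONG (64-bit assumption).
        MethodHandle handle = linker.downcallHandle(
                linker.defaultLookup().find("strlen").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
        // Memory allocated from the arena is freed when the try block exits.
        try (Arena arena = Arena.ofConfined()) {
            // Copies the string off-heap as a NUL-terminated UTF-8 C string.
            MemorySegment cString = arena.allocateFrom(s);
            return (long) handle.invokeExact(cString);
        } catch (Throwable t) {
            throw new IllegalStateException("strlen downcall failed", t);
        }
    }
}
```

Touching `cString` after the arena is closed throws an exception instead of reading freed memory, which is the guarantee mentioned above. Running this without `--enable-native-access` should print a warning on Java 22.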

The interesting stuff is passing function pointers back and forth - look for `downcall method handles`.
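To make the function-pointer part concrete, here is a sketch along the lines of the `qsort` example in the JDK documentation: a downcall handle invokes C's `qsort`, and an upcall stub turns a Java method into the comparator function pointer. It assumes Java 22 and a 64-bit `size_t`; the class name `QsortDemo` is made up.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Hypothetical demo class: sorts an int array with C's qsort and a Java comparator. */
final class QsortDemo {
    private QsortDemo() {}

    /** Comparator that C calls back into; each argument points at one int. */
    static int compare(MemorySegment a, MemorySegment b) {
        return Integer.compare(a.get(ValueLayout.JAVA_INT, 0), b.get(ValueLayout.JAVA_INT, 0));
    }

    static int[] sort(int[] input) {
        Linker linker = Linker.nativeLinker();
        try (Arena arena = Arena.ofConfined()) {
            // void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void*, const void*))
            MethodHandle qsort = linker.downcallHandle(
                    linker.defaultLookup().find("qsort").orElseThrow(),
                    FunctionDescriptor.ofVoid(ValueLayout.ADDRESS, ValueLayout.JAVA_LONG,
                            ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
            // The target layout gives the incoming pointers a readable size of 4 bytes.
            FunctionDescriptor cmpDesc = FunctionDescriptor.of(ValueLayout.JAVA_INT,
                    ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT),
                    ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT));
            MethodHandle cmp = MethodHandles.lookup().findStatic(QsortDemo.class, "compare",
                    MethodType.methodType(int.class, MemorySegment.class, MemorySegment.class));
            // The stub is a native function pointer that stays valid while the arena is open.
            MemorySegment cmpPtr = linker.upcallStub(cmp, cmpDesc, arena);
            MemorySegment array = arena.allocateFrom(ValueLayout.JAVA_INT, input);
            qsort.invokeExact(array, (long) input.length, ValueLayout.JAVA_INT.byteSize(), cmpPtr);
            return array.toArray(ValueLayout.JAVA_INT);
        } catch (Throwable t) {
            throw new IllegalStateException("qsort call failed", t);
        }
    }
}
```

One caveat: the comparator must not throw, because an exception escaping an upcall terminates the JVM.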

11 days ago

dzaima

Don't have an article, but the gist on stacks is that Java still uses the regular architecture stack (rsp on x86, etc) that the FFI'd code will, and on exit to/entry from FFI it'd have to store its stack end/start pointer (or otherwise be able to figure the range out) such that GC knows what to scan.

11 days ago

kgeist

I wonder how it works when you use virtual threads. In Go, goroutines have resizable stacks which notoriously complicates FFI because C has no idea about resizable stacks (IIRC they have to temporarily switch to a separate, special C stack).

11 days ago

mike_hearn

When it's running, a virtual thread is using a physical OS-level thread, and if you call into C then that virtual thread won't suspend. It pins the OS thread. So it's all transparent.

11 days ago

jakjak123

I had a C library I needed to ideally use from Java directly. The new FFI API looks great, but unfortunately the C API relied heavily on macros and void* arguments, making it incredibly difficult to model from Java.

11 days ago

the-alchemist

I would give the jextract tool a try. I believe it uses LLVM to parse the header files, so the generated bindings might actually be pretty good.

10 days ago

jakjak123

I did use jextract for java 19+20, but it looked very messy.

Tried it again yesterday on java 22, and the helpers from jextract are waaaay better. I actually completed an MVP implementation this time in a couple of hours. This could perhaps be released as a library if I find the effort to wrap it in a meaningful way!

We currently wrap this in java by calling the binary with subprocesses, which has been working great at the cost of some latency overhead. The big bonus of this, though, is that we can kill the process from java when it misbehaves. Putting this C code inside Java again means we likely lose that control.

7 days ago

petesergeant

I can’t see why I’d ever reach for it, but I do like knowing that Java is actively being improved over time

11 days ago

[deleted]
11 days ago

xyst

What’s the use case here? Developing drivers with Java?

11 days ago

invalidname

Invoking native code has always been necessary in Java. In the past it was done via JNI which has many issues. These new APIs solve the issues and simplify the API. The use case is interacting with anything that isn't written in Java.

11 days ago

xtracto

Blast from the past! I remember doing JNI integration in Java around 2003! It's been so long I don't remember details but you had to declare some interfaces in java, then some middleware .h or .c and then call the native library iirc.

Glad to see things are progressing!!

11 days ago

neonsunset

Same use case as to why .NET has low/zero-cost FFI.

This is similar, except more boilerplate and much, much slower.

11 days ago

pron

The FFM downcalls in OpenJDK compile down to argument shuffling + a CALL instruction (in "critical" linker mode), i.e. the same machine code gcc/clang would generate for a call from a C program.
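
For readers who have not seen the "critical" linker mode, a minimal sketch follows, assuming Java 22+ and that the default lookup exposes the C `abs` function. The class and method names are hypothetical. `Linker.Option.critical(false)` marks the function as short-running and not calling back into Java; the `false` means no heap segments are passed to it.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical demo class.
class CriticalDowncallSketch {
    private static final Linker LINKER = Linker.nativeLinker();

    // int abs(int n); linked with the critical option.
    private static final MethodHandle ABS = LINKER.downcallHandle(
            LINKER.defaultLookup().find("abs").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT),
            Linker.Option.critical(false));

    // Invokes the native abs through the downcall handle.
    static int nativeAbs(int value) {
        try {
            return (int) ABS.invokeExact(value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(nativeAbs(-42));
    }
}
```

Keeping the handle in a `static final` field is a deliberate choice in this sketch: it gives the JIT a constant handle to work with. The code does not measure anything, so it says nothing on its own about the performance claim.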

11 days ago

neonsunset

This is what it is compiled to in .NET[0] today more or less[1]. What does OpenJDK compile these to? (edit: misread as could compile. Hmm, I wonder how much difference there will be in average FFI cost with newer APIs vs direct calls)

[0] Objects that need pinning are pinned(by toggling a bit in object header), byrefs are pinned by simply storing them on the stack, arguments that need marshalling involve calling corresponding marshalling code. That code can allocate intermediate data on heap, on stack or call NativeMemory.Alloc/.Free C-style.

[1] Overhead can be further reduced by 1. annotating FFI calls with [SuppressGCTransition] which saves on possible arguments stack spills and GC helper call, replacing the call with a single flag check and optional call into GC in epilog, 2. in NativeAOT, p/invokes can be "direct" which saves on initialization checks and indirections (though they are reduced in JIT as it can bake data directly into codegen after static init has finished on recompilation). This has a tradeoff as system's dynamic loader will be used at application startup instead of regular lazy initialization and 3. direct p/invokes can be upgraded to static linking, which transforms them into direct calls identical to regular C calls save for the same GC flag check in post-condition. This comes with compiling .NET executables and libraries into a single statically linked binary (well, statically linked for the native dependencies the user has opted into linking this way).

11 days ago

int_19h

What does it do if you need to pass a struct that contains another struct?

11 days ago

pron

The same, but you need to define the layout appropriately. The JEP covers the basics: https://openjdk.org/jeps/454. As I explained in another comment [1], we didn't want to trade off performance or limit the runtime, so the API for describing native layouts is more elaborate.
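
As an illustration of "define the layout appropriately", here is a small sketch for a struct that contains another struct, assuming Java 22+. The C types `Point` and `Line` and all Java names are invented for the example:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;

// Hypothetical layouts for:
//   struct Point { int x; int y; };
//   struct Line  { struct Point start; struct Point end; };
class NestedStructSketch {
    static final StructLayout POINT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("x"),
            ValueLayout.JAVA_INT.withName("y")).withName("Point");

    // The inner struct layout is reused as a member of the outer one.
    static final StructLayout LINE = MemoryLayout.structLayout(
            POINT.withName("start"),
            POINT.withName("end")).withName("Line");

    // Byte offset of line.end.y, resolved through a layout path.
    static final long END_Y_OFFSET = LINE.byteOffset(
            PathElement.groupElement("end"),
            PathElement.groupElement("y"));

    // Writes and reads back line.end.y in a freshly allocated native struct.
    static int roundTripEndY(int value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment line = arena.allocate(LINE);
            line.set(ValueLayout.JAVA_INT, END_Y_OFFSET, value);
            return line.get(ValueLayout.JAVA_INT, END_Y_OFFSET);
        }
    }

    public static void main(String[] args) {
        System.out.println(LINE.byteSize());
        System.out.println(END_Y_OFFSET);
        System.out.println(roundTripEndY(7));
    }
}
```

A segment allocated from `LINE` could then be passed to a downcall whose `FunctionDescriptor` uses `LINE` (by value) or `ValueLayout.ADDRESS` (by pointer), depending on the C signature.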

[1]: https://news.ycombinator.com/item?id=40303242

11 days ago

pjmlp

I still have some hopes that it will evolve towards a P/Invoke like experience.

While a step closer to Valhalla, the whole dev experience is still quite lacking versus what .NET offers.

Currently it is too much like making direct use of InteropServices.

11 days ago

pron

> I still have some hopes that it will evolve towards a P/Invoke like experience.

Doubtful, given that this is something we worked hard to avoid. To be efficient, a P/Invoke-like model places restrictions on the runtime, which inhibits optimisation and flexibility and this cost is worth it only when native calls are relatively common. In Java they are rare and easily abstracted away, so we opted for a model that offers full control without giving up on abstraction, given that only a very small number of experts (<1%) would directly write native calls and then hide them as implementation details. I'm not saying this approach is the right one for all languages, but it's clearly the right one for Java given the frequency of native calls and who makes them.

Of course, you can wrap FFM with a higher-level P/Invoke-like mechanism, but it won't give you as much control.

11 days ago

pjmlp

Well, for developers like myself that feel at home with JNI, the current development experience, even with jpackage, is too much to ask for.

I would rather keep writing C++ with JNI, instead of enduring the current boilerplate, especially if I already need to manually create header files to feed into jpackage, for basic stuff like struct definitions, which I don't feel like writing by hand.

As for performance, on this I agree with neonsunset: unless we see TechEmpower-benchmark-level evidence of Panama beating P/Invoke, it is pretty much theoretical stuff at the expense of developer convenience.

11 days ago

pron

We can't tailor every feature to the widely disparate preferences of so many developers nor do we try to convince every last developer of the merit of our approach -- this is both impractical and a losing strategy. Rather, we rely on our experience designing a highly successful language and platform, and consult with companies -- each employing thousands of Java developers -- and authors of some of the most popular relevant Java libraries to ensure that we meet their requirements. Of course, we also look at what other languages have done and the tradeoffs they've accepted (some of which may not be appropriate for Java [1]), but there are always many possible designs and we don't adopt one from a less successful language just because it, too, has its fans.

I would encourage those who think that we're consistently making suboptimal choices for Java compared to choices made by significantly less successful languages to consider whether it is possible that their preferences are not aligned with those of the software market at large. Java is and aims to continue being the world's most popular language for serious server software, and that requires tailoring designs to a very large audience.

I always notice a certain lack of respect on forums such as HN for the world's most consistently successful and popular languages -- JS, Java, and Python. Different programmers have different preferences and I'm all for rooting for the underdog now and again, but you simply cannot consistently make wrong decisions over a very long period of time and yet consistently win. What we do may not be everyone's cup of tea (no language is), but it is clearly that of a whole lot of people. We work to offer value to them.

[1]: E.g. the design of native interop has significantly impacted that of user-mode threads (or lack thereof: https://github.com/dotnet/runtimelab/issues/2398) in both .NET and Go, and we weren't willing to make such tradeoffs in either performance or programming model.

11 days ago

pjmlp

I can say that in my bubble we reach out for Java, because of Spring, AEM and Android.

That is it, other use cases, have other programming stacks.

As such, our native libraries are written to be consumed, at the very least, across .NET (P/Invoke, C++/CLI, COM), Java (JNI), nodejs (C++ addons), Swift.

So to move the existing development workflow from JNI to Panama, it must be an easy sell why we should budget rewrites to start with.

Also in regards to "hate", if all decisions were that great there wouldn't be a need to create a new library support group to help the Java ecosystem actually move forward and adopt new Java versions, as I learned from JFokus related content.

11 days ago

pron

You shouldn't! We're not trying to "sell" any rewrite from JNI to FFM. Since FFM is both significantly easier to use and offers better performance, most people would choose to write new interop code with FFM; that is an easy sell. But that's not to say that these benefits justify a rewrite of existing code, and we have no plan to remove JNI. JNI and FFM can coexist in the same program (and even in the same class). However, we are about to place the same protections on JNI as those we have on FFM to ensure that Java programs are free of undefined behaviour by default, and that modules that may introduce undefined behaviour are clearly acknowledged by the application so that the application owners may give them closer scrutiny if they wish [1].

To elaborate just a bit more on what I wrote in my previous comment, to get a straightforward interop with C you need to place certain restrictions on the runtime which limit your ability to implement certain abstractions such as moving GCs and user-mode threads. Because native interop requires special care anyway due to native memory management, which makes it significantly more complex than ordinary code and so less suitable for direct exposure to application developers -- so it's best done by experts in the area -- and on top of that native calls in Java aren't common, we decided not to sacrifice the runtime in favour of more direct interop. As a result, native interop is somewhat more elaborate to code, but as it requires some special expertise and so should be hidden away from application developers anyway, we decided it's better to place the extra burden on the experts doing the interop rather than trade off runtime capabilities and performance. We think this is the better tradeoff for Java. Consequently, we have both compacting collectors and no performance penalty for native calls on virtual threads. Other languages made whatever tradeoffs they thought were right for them, but they did very clearly sacrifice something.

[1]: https://openjdk.org/jeps/472

11 days ago

int_19h

A "certain lack of respect" comes from having to work with these languages for literally decades, and knowing their warts (and how those warts compare to some other similarly popular languages).

In general, being successful and popular had little to do with how well a PL is designed. Visual Basic, PHP, and even C are some historical examples that I have plenty of personal experience with.

11 days ago

pron

> In general, being successful and popular had little to do with how well a PL is designed.

Perhaps, but it is fairly easy to design a product for a small, self-selecting group of fans who find the aesthetics appealing and so declare the design good for their taste. Unless a language becomes heavily used in codebases that are maintained for years by a large variety of programmers, it's hard to tell how well it is actually designed as a mass-appeal product.

Two of the three languages you mentioned weren't able to attain nearly the same success as Java for as long a duration. I'd give C a similar success score because what it lacks in popularity it still makes up for in longevity, being almost twice as old. There are good reasons for why C is still as popular as it is. For example, in its domain -- which requires compilation to exotic architectures -- "good design" entails being able to easily implement efficient compilers.

11 days ago

int_19h

Of course, when you compare languages, you have to compare them to contemporary ones that also target the same niche. In case of C, that would be e.g. Modula-2. The consequences of the industry making an expedient but wrong choice then - 45 years ago! - are still with us: C++ only just got proper modules, and even then most C++ code written today is still mostly using #include...

And to be clear, I'm not advocating for aesthetics here. It's not like C# is a model of purity, either; but I would say that their choices over the years have been more pragmatic overall from the perspective of someone who needs to write readable, good quality code without jumping through too many hoops or getting lost in the verbiage.

11 days ago

kaba0

> C# [..] but I would say that their choices over the years have been more pragmatic overall from the perspective of someone who needs to write readable, good quality code without jumping through too many hoops or getting lost in the verbiage.

I personally don’t agree with that, C# is very “impulsive” at adding new features, which sounds cool in isolation, but makes the language significantly more complex to understand, and has non-intuitive interactions with other features.

I think C# is quick at going the C++ way, and there is no return from there if we guarantee compatibility.

I much prefer Java’s approach, where yeah, at times one might lack some syntactic sugar/nicety (often greatly overcome by IDE/tooling’s advancements), but over time they do add the important ones, committing only to features that have been earnestly tried and are sustainable.

11 days ago

pron

> I think C# is quick at going the C++ way, and there is no return from there

To be fair to Microsoft, they have favoured rich, complex languages for a long time now. They were probably the biggest champions of C++ and TypeScript also doesn't seem to be going down a particularly minimalistic route (an understatement). They're fans of rich languages, and while such languages are not my cup of tea, they do have a large audience (although I think it's a large minority audience).

8 days ago

pron

Modula-2 wasn't really a contemporary of C's. By the time it was released, C had already taken over the world. Plus, it's yet another case of something that looks good but has never really been tested. While not quite Modula-2, in the early oughts I was working on a large project that was half written in C++ and half in Ada. We're talking millions of lines of code in both languages here. The Ada code looked nice but we were cursing when we had to work with it for two reasons: we had to consult thick Ada manuals to grapple with language intricacies, and compilation times were frustratingly slow. With C++ we could spend more time thinking about the algorithms as there was less "language lawyering", and we could run more tests (ironically, C++ now suffers from both of these problems). Perhaps that's why to this day I prefer smaller languages with short compilation times (I like Clojure but dislike Scala; I like Zig but dislike Rust).

My point is that when people say that one language is technically superior to another, what they really mean is that it's superior in the technical aspects that they themselves value more than the aspects where the other language is technically superior. This is all fine, except that these personal preferences aren't distributed equally. This is a little like the Betamax vs. VHS debate. Sure, Betamax had a superior picture quality that some valued, but VHS had a superior recording time, which others valued, and that latter group was bigger.

As for C# -- strong disagree there. I think they're making the classic mistake of trying to solve every problem in the language and soon, resulting in a pretty haphazard collection of features, quite a few of them are anti-features, making up a pretty complicated language. For example, they have both properties and records, while in Java we figured that by adding records we'll both direct people toward a more data-oriented form of programming and at the same time make the problem of writing setters so much less annoying to the point it shouldn't be addressed by the language (while properties have the opposite effect of encouraging mutation). They've painted themselves into a very tight corner with async/await (the same with P/Invoke, which constrained their runtime's design making it harder to add user-mode threads), and I think they've made a big security mistake with interpolation -- something we're trying to avoid with a substantially different design. Also, while richer languages do have a lot of fans, all else being equal more people seem to prefer languages with fewer features. Our hypothesis is that it's better to spend a few years thinking how to avoid adding a language feature (like async/await or properties) than to spend a few months adding it.

Also, every feature you add constrains you a little in the future (and every language makes this tradeoff early when it's trying to acquire users, but once it's established you need to be more careful). That's why we try to keep the abstraction level high at the expense of a quicker and tighter fit to a particular environment. This delays some things, but we believe it keeps us more flexible to adapt to future changes. It's like having an adaptation budget that you don't want to fully spend on your current environment (I think P/Invoke and properties are such examples of overfitting that pays well in the short term and make you less adaptable in the long term). The complexity budget is another you want to conserve. Add a language feature to make every problem easier, and over time you find yourself not only constrained, but with a pretty complex language that few want to learn.

11 days ago

neonsunset

Async/await is not a tight corner as showcased by a multitude of languages adopting the pattern: Rust, Python, JavaScript and Swift. It is a clean abstraction where future progress is possible while retaining the convenience of its concurrency syntax and task composition.

Green threads experiment proved net negative in terms of benefit but the follow-up work on modernizing the implementation details of async/await itself was very successful:

Issue https://github.com/dotnet/runtime/issues/94620

Technical details https://github.com/dotnet/runtimelab/blob/feature/async2-exp...

The result is such that regardless of p/invoke existence green threads would have been a worse tradeoff.

It also seems that common practices in Java indicate that properties are not a mistake as showcased by popularity of Lombok and dozens of other libraries to generate builders and property-like methods (or, worse, Java developers having to write them by hand). In addition, properties existed in C# since its inception, that's...not a few years.

Not entirely sure about string interpolation but if you are alluding to `var text = $"Time: {DateTime.Now}";`, then it's a non-issue. APIs that care about it in complex contexts like querying a DB or logging can handle it with the interpolated string handler API, which allows passing a string interpolation expression to methods accepting interpolated string handler types. Those methods can then, for example, generate a parameterized query with sanitized inputs, without any friction for the user. Something that Java does not seem to sufficiently appreciate.

Example: https://learn.microsoft.com/en-us/ef/core/querying/sql-queri...

11 days ago

jacques_chester

> It is a clean abstraction

Ah, that must be why I see FooMethod and FooMethodAsync side-by-side in C# all the time.

10 days ago

pron

These are all valid and well-known opinions, but that's my point: there is nothing even remotely close to a consensus on them (never mind that even results don't extrapolate well from one language to another), and different choices appeal to different people.

We put a lot of thought into which features we want to add to the Java platform and in what form, and also consider what other languages have done. Sometimes we choose to make different tradeoffs based on what we think are the right tradeoffs for most Java users (a tradeoff that's right for language X may be wrong for language Y [1]), and sometimes we disagree on aesthetics or technical merit. But the choices we've made have worked well for Java. We're well aware of differing opinions, but it seems that we're managing to align with the majority opinions (don't confuse "popular" with "majority"; something like Lombok is quite popular in absolute terms, but is still liked by a minority, i.e. it is less popular than not using it; Kotlin is also quite popular, but it is still more than ten times less popular than Java so does that mean we should follow its decisions?). At the adoption levels enjoyed by JS, Python, and Java, something could be hugely popular in absolute terms yet liked by a minority.

In our primary domain of serious server-side software, no other language has done better (or as well), and we and our users are happy, for the most part, with the choices we've made (except maybe for choices made very early on, but that's true for all languages). The mere fact that sometimes not everyone agrees with our choices (let's be honest, programmers rarely agree on anything) doesn't mean we should change them, especially as languages that go a different way don't seem to be doing as well. Still, different programmers will continue liking different things, and most will continue insisting that their preferences -- however popular -- are somehow "objectively" better with or without bottom-line metrics to support their beliefs.

In general, thinking about a programming language from the perspective of a programmer situated in specific circumstances can be quite different from thinking about a programming language from the perspective of the language maintainer, who needs to take into account different and often conflicting needs of many programmers situated in a variety of different circumstances. The wider the market you're targeting, the more aspects there are to consider and the closer attention needs to be paid to the distribution of programmer preferences.

[1]: E.g. the technical constraints that impact the design and performance of user-mode threads in Rust or C++ are fundamentally different from those that affect Java (re. e.g. the cost of allocating memory, and where pointers are allowed to point). The constraints around async/await in JS -- where a lot of code is already written under the assumption of no intervention -- are also very different from those in Java, where threads have existed from day one.

10 days ago

_old_dude_

s/jpackage/jextract/g

11 days ago

cesarb

> [...] and this cost is worth it only when native calls are relatively common. In Java they are rare and easily abstracted away, [...] but it's clearly the right one for Java given the frequency of native calls [...]

Native calls are rare in Java because they're such a pain. If it wasn't so hard to do native calls in Java, it would be common even for non-experts to make use of non-Java libraries.

11 days ago

pron

I don't think so, given that there are more popular Java libraries than popular libraries with a C ABI. There is a small number of very popular C libraries that result in the majority of native call uses. But in any event, calling native libraries in Java is now no longer a pain thanks to FFM (and jextract [1]) so we'll see.

Note that interaction with native libraries often requires a more careful management of native memory that, though much easier now with FFM, is still significantly trickier (and more dangerous in terms of introducing undefined behaviour) than interacting with Java code regardless of how that interaction is declared in code. In Java, as in Python, interaction with native code -- in the vast majority of cases -- is best encapsulated inside a Java library and not often directly exposed to application programmers.

[1]: https://github.com/openjdk/jextract

11 days ago

kaba0

Interestingly enough, this actually turned into a positive over time — also, java was usually fast enough (compared to python) to avoid reaching for native all the time, so it wasn’t as big a pain point, it managed to create an almost completely pure, 99.9% java ecosystem. This means that even very very complex java apps will basically work on every OS, unlike python and to a smaller extent nodejs, where some cryptic dependency is only for windows/linux, etc.

11 days ago

miffy900

> This is similar, except more boilerplate and much, much slower.

That's JNI, which really was truly terrible. Java 22 introducing FFM is finally an admission that JNI was crap and a dead end.

11 days ago

peterashford

JNI worked; it just wasn't as ergonomic as it could have been - which was on purpose. I disagree that Java should have discouraged use of JNI in that way, but it was hardly "crap and a dead end".

11 days ago

miffy900

> JNI worked; it just wasn't as ergonomic as it could have been - which was on purpose

There's a reason they're calling it the 'FFM' API and not JNI v2. The API devs were correct in rethinking the approach to native interop.

This just proves my point; being crappy ON PURPOSE is why it's a dead end; it's very difficult to improve something that's been deliberately designed badly.

Besides that, no Java dev in their right mind is going to continue to use JNI once they upgrade to Java 22 and realise FFM exists.

11 days ago

pjmlp

I will keep using JNI, first of all because I cannot stand its boilerplate instead of having something nice like on .NET side, even jextract can't make up for it.

Secondly, our libraries also land on Android applications, and let's see if FFM ever lands on ART.

10 days ago

peterashford

There's no reason to use it now, yes. It wasn't a dead end, because people absolutely could use it and were using it just fine (if not happily).

11 days ago

trelane

[flagged]

11 days ago

Dwedit

C# does a much better job of calling into C code. All the programmer has to do is either write an extern function with the "DllImport" attribute, or turn a raw function pointer into a delegate. (Or even directly use a function pointer in newer versions of C#.)

11 days ago

p0w3n3d

Last time I checked (ca. 2017-9) every call to a foreign API in Java had to create a memory barrier, causing a flush of all CPU caches. This was different from using normal JVM interfaces, and when I asked some guy at a Java conference, he told me they cheated when writing calls to the JVM API, but other people need to adhere to the rules. I wonder what has happened in this matter in Java 22, as this change was highly anticipated.

11 days ago

ryanpetrich

Memory barriers don't force a flush of all CPU cache. They will enforce the ordering of memory operations issued before and after the barrier instruction, preserving the contents of the CPU's various caches.

11 days ago