Compiling and executing C from JavaScript

The world runs on C. The language powers file compression, network communication, and even the browser you're reading this article in. Code that isn't written in C usually still talks to a C ABI (this goes for C++, Rust, Zig, etc.) exposed as a C library. The C language and C ABIs are the past, present, and future of systems programming.

That's why Bun v1.1.28 introduces experimental support for compiling and executing native C from JavaScript.

hello.c

#include <stdio.h>

void hello() {
  printf("You can now compile & run C in Bun!\n");
}

hello.ts

import { cc } from "bun:ffi";

export const {
  symbols: { hello },
} = cc({
  source: "./hello.c",
  symbols: {
    hello: {
      returns: "void",
      args: [],
    },
  },
});

hello();

On Twitter, many people asked us the same question:

“Why would I ever need to compile C programs and then run them from JavaScript?”

Previously, there were two options for using system libraries from JavaScript:

  1. Write an N-API addon (napi) or an addon for the V8 C++ API
  2. Compile code to WASM/WASI using emscripten or wasm-pack

What's wrong with N-API (napi)?

N-API (napi) is a runtime-independent C API designed for exposing native libraries to JavaScript. Both Bun and Node.js implement it. Before napi, native addons typically used the V8 C++ API directly, which could break whenever Node.js upgraded V8.

Compiling native addons breaks CI

Native addons usually rely on a "postinstall" script that compiles the N-API addon with node-gyp. node-gyp depends on Python 3 and a modern C++ compiler.

Many people are unpleasantly surprised to discover that their CI pipeline needs Python 3 and a C++ compiler just to build a JavaScript client application.


Compiling native addons is hard work (for maintainers)

To deal with this problem, some libraries prebuild binaries for each platform and publish them as optional dependencies, using the "os" and "cpu" fields in package.json. This shifts the complexity from users to maintainers, but maintaining a build matrix across 10 different target platforms is not easy.

@napi-rs/canvas/package.json


"optionalDependencies": {
  "@napi-rs/canvas-win32-x64-msvc": "0.1.55",
  "@napi-rs/canvas-darwin-x64": "0.1.55",
  "@napi-rs/canvas-linux-x64-gnu": "0.1.55",
  "@napi-rs/canvas-linux-arm-gnueabihf": "0.1.55",
  "@napi-rs/canvas-linux-x64-musl": "0.1.55",
  "@napi-rs/canvas-linux-arm64-gnu": "0.1.55",
  "@napi-rs/canvas-linux-arm64-musl": "0.1.55",
  "@napi-rs/canvas-darwin-arm64": "0.1.55",
  "@napi-rs/canvas-android-arm64": "0.1.55"
}

JavaScript → N-API function calls: 3x overhead

What do we get in exchange for this build complexity?


When using the JavaScriptCore C++ API directly, calling a simple noop function takes about 2 ns. Through N-API, the same noop call takes about 7 ns.

Why do we pay for this with a threefold drop in performance?

Unfortunately, the problem is how the API is designed. For napi to behave identically across runtimes, even simple operations, like reading an integer out of a JavaScript value, must go through dynamically dispatched library calls. Argument types are also re-checked at runtime on every one of those calls. More complex operations repeatedly allocate memory (or create objects that must later be garbage collected) and add layers of pointer indirection. N-API was never designed to be fast.

JavaScript is the most popular programming language in the world. Surely we can do better than this.

What about WebAssembly?

To get around N-API's inherent build complexity and performance issues, some projects will first compile the native addon to WebAssembly and then import it into JavaScript.

Since JavaScript engines can inline function calls that cross into WebAssembly, this approach can perform well.

But system libraries are a poor fit for WebAssembly's isolated memory model, which forces serious compromises.

Isolated: no system calls

WebAssembly can only access the functionality that its host environment explicitly exposes to it, typically via JavaScript.

What about libraries that depend on system APIs, such as the macOS Keychain API (for securely storing and retrieving passwords) or audio recording? What if your CLI wants to use the Windows Registry?

Isolated: everything must be cloned

Modern processors support about 280 TB of addressable memory (48-bit addresses). WebAssembly, however, is 32-bit and can only access its own memory.

That means that, by default, strings and binary data passed between JavaScript and WebAssembly must be cloned on every call. For many projects, this wipes out any performance gains from using WebAssembly in the first place.

What if using server-side JavaScript weren't limited to N-API and WebAssembly? What if we could compile native C and call it from JavaScript through shared memory, with almost zero call overhead?

Compiling and Executing Native C from JavaScript

Let's walk through a quick example: a random number generator compiled from C and called from JavaScript.

myRandom.c

#include <stdio.h>
#include <stdlib.h>

int myRandom() {
    return rand() + 42;
}

The JavaScript that compiles and runs the C:

main.js


import { cc } from "bun:ffi";

export const {
  symbols: { myRandom },
} = cc({
  source: "./myRandom.c",
  symbols: {
    myRandom: {
      returns: "int",
      args: [],
    },
  },
});

console.log("myRandom() =", myRandom());
And the output:

bun ./main.js
myRandom() = 43

How does this work?

bun:ffi uses TinyCC to compile, link, and load C programs in memory. It then generates inlinable wrapper functions that convert JavaScript primitive types to C primitive types and back.

For example, to convert a C int into JavaScriptCore's EncodedJSValue representation, the generated code performs the following operation:

static int64_t int32_to_js(int32_t input) {
  return 0xfffe000000000000ll | (uint32_t)input;
}

Unlike N-API, these type conversions happen with zero dynamic-dispatch overhead. And because the wrappers are generated when the C is compiled, the engine can safely inline them without compatibility concerns or a performance penalty.

bun:ffi compiles quickly

If you've worked with clang or gcc before, you might be thinking:

clang/gcc user: “Great. Now I have to wait 10 seconds each time for C to compile if I want to run this JS.”

Let's measure how long this compilation takes with bun:ffi:

main.js

import { cc } from "bun:ffi";

+ console.time("Compile ./myRandom.c");
export const {
  symbols: { myRandom },
} = cc({
  source: "./myRandom.c",
  symbols: {
    myRandom: {
      returns: "int",
      args: [],
    },
  },
});
+  console.timeEnd("Compile ./myRandom.c");

Output:

bun ./main.js
[5.16ms] Compile ./myRandom.c
myRandom() = 43

5.16 ms. Thanks to TinyCC, compiling C in Bun is fast. We'd have had trouble shipping this feature if it took 10 seconds to compile.

bun:ffi has low call overhead

Foreign function interfaces (FFI) are notoriously slow. In Bun, it's different.

Before measuring Bun, let's establish the upper limit: how fast can such a call possibly be? For simplicity, we'll use Google's benchmark library (it requires a .cpp file):

bench.cpp

#include <stdio.h>
#include <stdlib.h>
#include <benchmark/benchmark.h>

int myRandom() {
    return rand() + 42;
}

static void BM_MyRandom(benchmark::State& state) {
  for (auto _ : state) {
    benchmark::DoNotOptimize(myRandom());
  }
}
BENCHMARK(BM_MyRandom);

BENCHMARK_MAIN();

Now the output:

clang++ ./bench.cpp -L/opt/homebrew/lib -l benchmark -O3 -I/opt/homebrew/include -o bench
./bench
------------------------------------------------------
Benchmark            Time             CPU   Iterations
------------------------------------------------------
BM_MyRandom       4.67 ns         4.66 ns    150144353

So in C/C++ the call takes about 4 nanoseconds. That's our ceiling; we can't go faster than that.

How long does this process take when working with bun:ffi?

bench.js

import { bench, run } from 'mitata';
import { myRandom } from './main';

bench('myRandom', () => {
  myRandom();
});

run();

This is what I get on my machine:

bun ./bench.js
cpu: Apple M3 Max
runtime: bun 1.1.28 (arm64-darwin)

benchmark      time (avg)             (min … max)       p75       p99      p999
------------------------------------------------- -----------------------------
myRandom     6.26 ns/iter    (6.16 ns … 17.68 ns)   6.23 ns   7.67 ns  10.17 ns

6 nanoseconds. So the overhead of a bun:ffi call is only 6 ns - 4 ns = 2 ns.

What can you build with this?

bun:ffi can work with dynamically linked shared libraries.

A 3x speedup when converting short videos with ffmpeg

If you eliminate the overhead of spawning a new process and avoid allocating lots of memory for every video, converting many short videos becomes about three times faster than usual.

ffmpeg.js

import { cc, ptr } from "bun:ffi";
import source from "./mp4.c" with {type: 'file'};
import { basename, extname, join } from "path";

console.time(`Compile ./mp4.c`);
const {
  symbols: { convert_file_to_mp4 },
} = cc({
  source,
  library: ["c", "avcodec", "swscale", "avformat"],
  symbols: {
    convert_file_to_mp4: {
      returns: "int",
      args: ["cstring", "cstring"],
    },
  },
});
console.timeEnd(`Compile ./mp4.c`);
const outname = join(
  process.cwd(),
  basename(process.argv.at(2), extname(process.argv.at(2))) + ".mp4"
);
const input = Buffer.from(process.argv.at(2) + "\0");
const output = Buffer.from(outname + "\0");
for (let i = 0; i < 10; i++) {
  console.time(`Convert ${process.argv.at(2)} to ${outname}`);
  const result = convert_file_to_mp4(ptr(input), ptr(output));
  if (result == 0) {
    console.timeEnd(`Convert ${process.argv.at(2)} to ${outname}`);
  }
}

mp4.c

#include <dlfcn.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include <libavutil/imgutils.h>
#include <libavutil/opt.h>
#include <libswscale/swscale.h>
#include <stdio.h>
#include <stdlib.h>

int to_mp4(void *buf, size_t buflen, void **out, size_t *outlen) {
  AVFormatContext *input_ctx = NULL, *output_ctx = NULL;
  AVIOContext *input_io_ctx = NULL, *output_io_ctx = NULL;
  uint8_t *output_buffer = NULL;
  int ret = 0;
  int64_t *last_dts = NULL;

  // Register all codecs and formats

  // Create input IO context
  input_io_ctx = avio_alloc_context(buf, buflen, 0, NULL, NULL, NULL, NULL);
  if (!input_io_ctx) {
    ret = AVERROR(ENOMEM);
    goto end;
  }

  // Allocate input format context
  input_ctx = avformat_alloc_context();
  if (!input_ctx) {
    ret = AVERROR(ENOMEM);
    goto end;
  }

  input_ctx->pb = input_io_ctx;

  // Open input
  if ((ret = avformat_open_input(&input_ctx, NULL, NULL, NULL)) < 0) {
    goto end;
  }

  // Retrieve stream information
  if ((ret = avformat_find_stream_info(input_ctx, NULL)) < 0) {
    goto end;
  }

  // Allocate output format context
  avformat_alloc_output_context2(&output_ctx, NULL, "mp4", NULL);
  if (!output_ctx) {
    ret = AVERROR(ENOMEM);
    goto end;
  }

  // Create output IO context
  ret = avio_open_dyn_buf(&output_ctx->pb);
  if (ret < 0) {
    goto end;
  }

  // Copy streams
  for (int i = 0; i < input_ctx->nb_streams; i++) {
    AVStream *in_stream = input_ctx->streams[i];
    AVStream *out_stream = avformat_new_stream(output_ctx, NULL);
    if (!out_stream) {
      ret = AVERROR(ENOMEM);
      goto end;
    }

    ret = avcodec_parameters_copy(out_stream->codecpar, in_stream->codecpar);
    if (ret < 0) {
      goto end;
    }
    out_stream->codecpar->codec_tag = 0;
  }

  // Write header
  ret = avformat_write_header(output_ctx, NULL);
  if (ret < 0) {
    goto end;
  }

  // Allocate last_dts array
  last_dts = calloc(input_ctx->nb_streams, sizeof(int64_t));
  if (!last_dts) {
    ret = AVERROR(ENOMEM);
    goto end;
  }

  // Copy packets
  AVPacket pkt;
  while (1) {
    ret = av_read_frame(input_ctx, &pkt);
    if (ret < 0) {
      break;
    }

    AVStream *in_stream = input_ctx->streams[pkt.stream_index];
    AVStream *out_stream = output_ctx->streams[pkt.stream_index];

    // Convert timestamps
    pkt.pts =
        av_rescale_q_rnd(pkt.pts, in_stream->time_base, out_stream->time_base,
                         AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX);
    pkt.dts =
        av_rescale_q_rnd(pkt.dts, in_stream->time_base, out_stream->time_base,
                         AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX);
    pkt.duration =
        av_rescale_q(pkt.duration, in_stream->time_base, out_stream->time_base);

    // Ensure monotonically increasing DTS
    if (pkt.dts <= last_dts[pkt.stream_index]) {
      pkt.dts = last_dts[pkt.stream_index] + 1;
      pkt.pts = FFMAX(pkt.pts, pkt.dts);
    }
    last_dts[pkt.stream_index] = pkt.dts;

    pkt.pos = -1;

    ret = av_interleaved_write_frame(output_ctx, &pkt);
    if (ret < 0) {
      char errbuf[AV_ERROR_MAX_STRING_SIZE];
      av_strerror(ret, errbuf, AV_ERROR_MAX_STRING_SIZE);
      fprintf(stderr, "Error writing frame: %s\n", errbuf);
      break;
    }
    av_packet_unref(&pkt);
  }

  // Write trailer
  ret = av_write_trailer(output_ctx);
  if (ret < 0) {
    goto end;
  }

  // Get the output buffer
  *outlen = avio_close_dyn_buf(output_ctx->pb, &output_buffer);
  *out = output_buffer;
  output_ctx->pb = NULL; // Set to NULL to prevent double free

  ret = 0; // Success

end:
  if (input_ctx) {
    avformat_close_input(&input_ctx);
  }
  if (output_ctx) {
    avformat_free_context(output_ctx);
  }
  if (input_io_ctx) {
    av_freep(&input_io_ctx->buffer);
    av_freep(&input_io_ctx);
  }

  return ret;
}

int convert_file_to_mp4(const char *input_filename,
                        const char *output_filename) {
  FILE *input_file = NULL;
  FILE *output_file = NULL;
  uint8_t *input_buffer = NULL;
  uint8_t *output_buffer = NULL;
  size_t input_size = 0;
  size_t output_size = 0;
  int ret = 0;

  // Open the input file
  input_file = fopen(input_filename, "rb");
  if (!input_file) {
    perror("Could not open input file");
    return -1;
  }

  // Get the size of the input file
  fseek(input_file, 0, SEEK_END);
  input_size = ftell(input_file);
  fseek(input_file, 0, SEEK_SET);

  // Allocate memory for the input buffer
  input_buffer = (uint8_t *)malloc(input_size);
  if (!input_buffer) {
    perror("Could not allocate input buffer");
    ret = -1;
    goto cleanup;
  }

  // Read the input file into the buffer
  if (fread(input_buffer, 1, input_size, input_file) != input_size) {
    perror("Could not read input file");
    ret = -1;
    goto cleanup;
  }

  // Call the to_mp4 function to convert the buffer
  ret = to_mp4(input_buffer, input_size, (void **)&output_buffer, &output_size);
  if (ret < 0) {
    fprintf(stderr, "Error converting to MP4\n");
    goto cleanup;
  }

  // Open the output file
  output_file = fopen(output_filename, "wb");
  if (!output_file) {
    perror("Could not open output file");
    ret = -1;
    goto cleanup;
  }

  // Write the output buffer to the file
  if (fwrite(output_buffer, 1, output_size, output_file) != output_size) {
    perror("Could not write output file");
    ret = -1;
    goto cleanup;
  }

cleanup:

  if (output_buffer) {
    av_free(output_buffer);
  }
  if (input_file) {
    fclose(input_file);
  }
  if (output_file) {
    fclose(output_file);
  }

  return ret;
}

// for running it standalone
int main(const int argc, const char **argv) {

  if (argc != 3) {
    printf("Usage: %s <input_file> <output_file>\n", argv[0]);
    return -1;
  }

  const char *input_filename = argv[1];
  const char *output_filename = argv[2];

  int result = convert_file_to_mp4(input_filename, output_filename);
  if (result == 0) {
    printf("Conversion successful!\n");
  } else {
    printf("Conversion failed!\n");
  }
  return result;
}

Securely store and retrieve passwords using the macOS Keychain API

macOS has a built-in Keychain API for securely storing and retrieving passwords, but it is not exposed to JavaScript. Instead of wrapping it in N-API and configuring CMake with node-gyp, what if we could just write a few lines of C in our JS project and call it a day?

keychain.js

import { cc, ptr, CString } from "bun:ffi";
const {
  symbols: { setPassword, getPassword, deletePassword },
} = cc({
  source: "./keychain.c",
  flags: [
    "-framework",
    "Security",
    "-framework",
    "CoreFoundation",
    "-framework",
    "Foundation",
  ],
  symbols: {
    setPassword: {
      args: ["cstring", "cstring", "cstring"],
      returns: "i32",
    },
    getPassword: {
      args: ["cstring", "cstring", "ptr", "ptr"],
      returns: "i32",
    },
    deletePassword: {
      args: ["cstring", "cstring"],
      returns: "i32",
    },
  },
});

var service = Buffer.from("com.bun.test.keychain\0");

var account = Buffer.from("bun\0");
var password = Buffer.alloc(1024);
password.write("password\0");
var passwordPtr = new BigUint64Array(1);
passwordPtr[0] = BigInt(ptr(password));
var passwordLength = new Uint32Array(1);

setPassword(ptr(service), ptr(account), ptr(password));

passwordLength[0] = 1024;
password.fill(0);
getPassword(ptr(service), ptr(account), ptr(passwordPtr), ptr(passwordLength));
const result = new CString(
  Number(passwordPtr[0]),
  0,
  passwordLength[0]
);
console.log(result);

keychain.c

#include <Security/Security.h>
#include <stdio.h>
#include <string.h>

// Function to set a password in the keychain
OSStatus setPassword(const char* service, const char* account, const char* password) {
    SecKeychainItemRef item = NULL;
    OSStatus status = SecKeychainFindGenericPassword(
        NULL,
        strlen(service), service,
        strlen(account), account,
        NULL, NULL,
        &item
    );

    if (status == errSecSuccess) {
        // Update existing item
        status = SecKeychainItemModifyAttributesAndData(
            item,
            NULL,
            strlen(password),
            password
        );
        CFRelease(item);
    } else if (status == errSecItemNotFound) {
        // Add new item
        status = SecKeychainAddGenericPassword(
            NULL,
            strlen(service), service,
            strlen(account), account,
            strlen(password), password,
            NULL
        );
    }

    return status;
}

// Function to get a password from the keychain
OSStatus getPassword(const char* service, const char* account, char** password, UInt32* passwordLength) {
    return SecKeychainFindGenericPassword(
        NULL,
        strlen(service), service,
        strlen(account), account,
        passwordLength, (void**)password,
        NULL
    );
}

// Function to delete a password from the keychain
OSStatus deletePassword(const char* service, const char* account) {
    SecKeychainItemRef item = NULL;
    OSStatus status = SecKeychainFindGenericPassword(
        NULL,
        strlen(service), service,
        strlen(account), account,
        NULL, NULL,
        &item
    );

    if (status == errSecSuccess) {
        status = SecKeychainItemDelete(item);
        CFRelease(item);
    }

    return status;
}

What can this be useful for?

This is a very low-level primitive that demonstrates how to use C and system libraries from JavaScript. A project that uses JavaScript can now also use C, without an extra build step.

This approach shines for glue code that binds C or C-like libraries to JavaScript. Sometimes you want to use a library or system API from JavaScript, and that library was never designed to be called from JavaScript.

It's usually easiest to write a small C wrapper that repackages that kind of code into an API that's convenient to call from JavaScript. Here's why:

  • The examples stay written in C, rather than in JavaScript through a foreign function interface.
  • With FFI, you constantly have to switch mentally between JavaScript and C. Pointers are easier to work with in C than through FFI, where JavaScript sees them as typed arrays. Why not keep that complexity on the C side?

When is this method not suitable?

Every tool involves trade-offs.

  • It's probably not suitable for compiling large C projects such as PostgreSQL or SQLite. TinyCC produces reasonably fast machine code, but it doesn't implement the advanced optimizations of Clang or GCC, such as autovectorization or highly specialized CPU instructions.
  • You probably won't get significant performance gains by micro-optimizing small parts of your codebase this way. I'd be happy to be proven wrong!

