shwestrick/sml-fast-real

sml-fast-real

Standard ML library for faster parsing of Reals (floats/doubles), heavily inspired by fast_float.

Features a zero-allocation fast path that is as much as 8x faster than the Basis implementations of Real.scan and Real.fromString. The slow path currently falls back on Real.scan.

FastReal.from_string is meant to be a drop-in replacement for Real.fromString. Note that SML implementations disagree on the result for certain input strings; see issue #3.

Compatible with the smlpkg package manager.

Performance

Here are timings with MaPLe on my MacBook Air (2022, M2 chip). The input set is generated by test/RealStringGen, with approximately 95% of inputs hitting the fast path of FastReal.

MaPLe v0.5.3 (8 threads), `Real.scan`, 1 million input strings (approx 10MB):
avg 0.1208s
min 0.1148s
max 0.1259s
average throughput: 87.8 MB/s

MaPLe v0.5.3 (8 threads), `FastReal.from_chars`, 1 million input strings (approx 10MB):
avg 0.0163s
min 0.0152s
max 0.0195s
average throughput: 650.8 MB/s    <----- ~7.4x improvement over Real.scan

The next major TODO would be to use SIMD/vectorization for even more speedup. See issue #2. With SIMD I wouldn't be surprised if we could get 2-4x additional throughput, perhaps more. I wonder if eventually we could compete in raw performance with fast_float.

Library Sources

There is one source file, tested with both MLton and MaPLe.

  • lib/github.com/shwestrick/sml-fast-real/sources.mlb

Interface

The library defines a functor, FastReal, which takes an implementation of Reals as input.

The input structure must also define a function fromLargeWord: LargeWord.word -> real, which rounds the input value (interpreted as an unsigned integer) to the nearest representable floating point value.

This function is used on the fast path; ideally, it should have very low overhead and zero allocation. In MLton (and MaPLe), suitable functions are MLton.Real32.fromLargeWord and MLton.Real64.fromLargeWord. See below for example usage.
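For compilers that lack the MLton-specific structures, one possible fallback is to build fromLargeWord from Basis Library functions only. The sketch below is an assumption, not part of this library: PortableR64 is a hypothetical name, it has not been tested with FastReal, and going through LargeInt may allocate, which would give up some of the fast path's benefit. It relies on LargeWord.toLargeInt treating the word as unsigned and on Real64.fromLargeInt rounding according to the current rounding mode (round-to-nearest unless it has been changed).

```sml
(* Hypothetical portable argument structure for the FastReal functor.
 * Assumption: the compiler provides Real64 and LargeWord from the Basis. *)
structure PortableR64 =
struct
  open Real64

  (* LargeWord.toLargeInt interprets the word as an unsigned integer.
   * Real64.fromLargeInt then rounds using the current rounding mode,
   * which is round-to-nearest by default. This path may allocate. *)
  fun fromLargeWord (w: LargeWord.word) : real =
    Real64.fromLargeInt (LargeWord.toLargeInt w)
end
```

On MLton and MaPLe, the MLton.Real64.fromLargeWord function mentioned above is likely the better choice.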

functor FastReal
  (R:
   sig
     include REAL
     val fromLargeWord: LargeWord.word -> real
   end):
sig
  (* implicitly defines a sequence of characters
   *   [ get(start), get(start+1), ..., get(stop-1) ]
   *)
  type chars = {start: int, stop: int, get: int -> char}

  type result_with_info = {result: R.real, num_chomped: int, fast_path: bool}

  val from_chars: chars -> R.real option
  val from_chars_with_info: chars -> result_with_info option

  val from_string: string -> R.real option
  val from_string_with_info: string -> result_with_info option
end =

Example Usage (MLton, MaPLe)

Example .mlb file:

$(SML_LIB)/basis/basis.mlb
$(SML_LIB)/basis/mlton.mlb    (* for MLton.Real64 *)
lib/github.com/shwestrick/sml-fast-real/sources.mlb
main.sml

Example main.sml:

structure R64 =
struct
  open MLton.Real64 (* need this for fromLargeWord *)
  open Real64
end

structure FR = FastReal(R64)

val r = valOf (FR.from_string "123.456E-1")
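
The interface also allows parsing from an arbitrary character range without first copying it into a string. The following is a minimal sketch of an alternative main.sml that uses from_chars_with_info on part of a larger string. The input line and the printed message format are illustrative assumptions, and reading num_chomped as the number of characters consumed is inferred from its name rather than confirmed by the documentation above.

```sml
structure R64 =
struct
  open MLton.Real64 (* need this for fromLargeWord *)
  open Real64
end

structure FR = FastReal(R64)

(* Illustrative input: the number starts at index 5, right after "temp=". *)
val line = "temp=21.5"

(* Describes the sequence [get 5, get 6, ..., get (size line - 1)].
 * get receives absolute indices, so no substring is created. *)
val chars =
  {start = 5, stop = String.size line, get = fn i => String.sub (line, i)}

val () =
  case FR.from_chars_with_info chars of
    NONE => print "could not parse a real\n"
  | SOME {result, num_chomped, fast_path} =>
      print (R64.toString result
             ^ " (num_chomped = " ^ Int.toString num_chomped
             ^ ", fast_path = " ^ Bool.toString fast_path ^ ")\n")
```

Handling the NONE case explicitly avoids the exception that valOf would raise on unparseable input.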
