This file documents changes to the Standard ML of New Jersey system since January of 2014 (Versions after 110.78). Earlier changes are documented in the HISTORY-pre2015 file. The change log primarily covers the compiler, the compilation manager (CM), the MLRISC library, and the runtime system. There are occasional entries about other components (e.g., the SML/NJ Library and ML-LPT), but these components have their own change logs that should be consulted.

Version 2022.1; 2022/08/25


Added system/iter-make script to iterate the compilation of the compiler. This script replaces the old fixpt script and is necessary when changing the version number of the system.


The change log file (HISTORY.txt) has been split into two files; one covering the period before 2015 and one covering changes since then.


The source repository has been migrated to GitHub. With this migration, we have substantially reorganized the source tree and build scripts.

The base directory has been eliminated; most of its components (cm, compiler, runtime and system) have been lifted to the top-level of the source tree. The other top-level directories are doc, libraries, llvm, smlnj-lib, and tools. The libraries and tools directories hold most of the non-core components of the system, such as cml, ml-yacc, etc.

The config/ script has been replaced by the script. The build process is simplified because the only component that is not in the git repository is the boot files. We have also removed the deprecated support for 32-bit systems from the various build scripts.


Implemented Basis Library proposal 2021-001 (Add getWindowSz function to Posix.TTY structure).


Implemented Basis Library proposal 2022-001 (Add tau to MATH signature).


Fixed some interfaces that did not agree with the SML Basis Library specification: bugs #318 (IEEEReal.decimal_approx does not match the Basis Library) and #319 (Type of Real.fromDecimal does not match the Basis Library).


Fixed bug #316 (Real.fromManExp does not return expected value if man = 0.0).


Fixed bug #317 (Conversion from string to real does not accept non-finite values).
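As an illustration of the behavior this fix concerns: the Basis Library specifies that real scanning also accepts inf, infinity, and nan (optionally signed, case-insensitive). The following is a minimal sketch, assuming a system that includes this fix; the helper name classify is hypothetical and uses only standard Basis functions.

```sml
(* Classify the result of scanning a string as a real.  The name
 * `classify` is hypothetical; only standard Basis functions are used.
 *)
fun classify (s : string) : string = (case Real.fromString s
       of NONE => "not a real"
        | SOME r =>
            if Real.isNan r then "nan"
            else if Real.isFinite r then "finite"
            else if r > 0.0 then "+inf"
            else "-inf"
      (* end case *))

(* scan a few finite, non-finite, and malformed inputs *)
val results = List.map classify ["1.5", "inf", "~infinity", "nan", "abc"]
```

On a system without the fix, the non-finite inputs would be expected to come back as "not a real".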


Fixed bug #314 (IEEEReal.float_class does not match the Basis Library). For some reason, the NAN constructor took an argument in our implementation. This code was probably an early design of the API that was changed in the Basis Library specification, but not in our code.


Fixed a pretty printing issue that arises when printing nested structure definitions. For example, opening the OS structure in the REPL would result in an extra newline between the colon and the sig keyword. This behavior is not present in the legacy version.


Added support for generating a SIG_GC signal when there is a garbage collection. I have also added a function

  val signalThreshold : int -> unit

to the SMLofNJ.Internals.GC structure that allows one to specify the threshold for generating a signal. The default is 1, which means that for any major collection a signal is generated. Setting the threshold to 0 means that minor collections also generate signals, while setting the value to something greater than 1 will filter out collections of younger generations. Collections that happen while a sigGC handler is running are ignored, which should not be an issue for thresholds of 1 or greater.

These changes fix bugs #65 (Garbage collection does not trigger sigGC) and #291 (Signals are not delivered for corresponding events).
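A minimal sketch of how the threshold might be combined with a sigGC handler, assuming the SML/NJ Signals structure (Signals.setHandler, Signals.sigGC, Signals.HANDLER) and the signalThreshold function described above. The names gcCount and installGCCounter are hypothetical.

```sml
(* Count GC signals.  `gcCount` and `installGCCounter` are hypothetical names. *)
val gcCount = ref 0

fun installGCCounter () = let
      (* A handler receives the signal, the number of pending occurrences,
       * and the continuation to resume; we resume where we left off.
       *)
      fun handler (_, n, k) = (gcCount := !gcCount + n; k)
      in
        (* signal on every major collection (the documented default) *)
        SMLofNJ.Internals.GC.signalThreshold 1;
        ignore (Signals.setHandler (Signals.sigGC, Signals.HANDLER handler))
      end
```

After calling installGCCounter (), the counter should advance as major collections occur; when the handler actually runs relative to a collection is up to the runtime, so code should not rely on an exact count.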


Fixed bug #290 (Incorrect pattern matching for exceptions). The fix involves changes to FLINT/opt/lcontract.sml (function swiInfo) and FLINT/opt/fcontract.sml (function fcSwitch). The contraction of a SWITCH with a constructor as subject is suppressed when the constructor is an exception constructor, because for exceptions comparison using the conrep field is inaccurate when the constructor is defined by an exception identity declaration (e.g., exception B = A).
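The kind of program affected can be sketched as follows. This is an illustrative example of an exception identity declaration, not the test case from the bug report; the function name is hypothetical.

```sml
exception A
exception B = A   (* B is another name for the same exception as A *)

(* Matching against B must succeed for a value built with A, because
 * both constructors denote the same exception.
 *)
fun name (e : exn) : string = (case e
       of B => "A/B"
        | Fail _ => "Fail"
        | _ => "other"
      (* end case *))

val r1 = name A
val r2 = name B
val r3 = (raise A) handle B => true | _ => false
```

Since A and B share one identity at run time, an optimizer that compares constructors by their static representation alone could pick the wrong branch.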


Fixed bug #314 (IEEEReal.setRoundingMode is a no-op on Linux).


Fixed bug #312 (CM.make is unable to handle filenames that contain a backslash). We have changed the semantics of paths given to the functions in the CM structure to be interpreted using the native pathname syntax (instead of CM's generic syntax).


Fixed bug #284 (Compiler bug: Contract: UsageMap on 132). The problem was that the CPSTrans.cpstrans function was generating code for loading spilled parameters in reverse order. In addition to fixing the bug, added some detailed documentation of the code.


Fixed bug #310 (Error when REPL tries to print value of type Posix.FileSys.ST.stat).


Fixed bug #306 (Word8VectorSlice: mapping a subslice produces wrong result or crashes SML/NJ).


Fix a module compilation performance bug by removing packStr and packFct from Elaborator/modules/sigmatch.sml (reducing the size of that file by about 25%), and removing the call of packStr (in function constrStr) in Elaborator/elaborate/elabmod.sml, replacing it with a call to Instantiate.instAbstr. Goodbye at last to packStr!

Version 2021.1; 2021/12/31


Switched over to the LLVM version of the compiler. Doing so has two major consequences:

  • The MLRISC code generator that we have used for over 25 years has been replaced with one based on the LLVM Libraries.

  • We have dropped support for 32-bit systems. Since we do not have a 64-bit Windows port yet, this version only supports "Unix" systems on the AMD64 processor (we have tested Linux and macOS). A 64-bit Windows version is a high priority and we hope to include it in the next release.

In addition, we have changed the version numbering scheme to a YYYY.NN scheme, where YYYY is the year of the release and NN is the release number for the year.


Fixed a bug in the way that non-printable char values were being printed in the REPL.


Remove last traces of the lambda splitting support from CM.


Changed the Binfile structure (and BINFILE signature) to support both the old binfile format and a new format (both formats are documented in the dev-notes/binfile.adoc file). We are still using the old format, but will be switching to the new format soon. The new format has a more structured header and removes the vestigial support for the FLINT lambda pickles.

The new header format includes a proper version number, which should make migrating to new binfile formats easier.


Committed changes to fix bug #281 (Redundant error messages for when a constructor name is misspelled). Revision r7357.

  • Modified Absyn.value datatype by adding an error constructor ERRORid.

  • Revised ElabData/staticenv/lookup.sml to simplify and improve some functions (lookVal =⇒ lookIdPath, lookValSym =⇒ lookIdSym) and added a function lookIdSymOp that returns an option.

  • Revised pat_id and deleted makeAPPpat in ElabUtil.

  • Revised elabPat in ElabCore.


Committed a massive set of changes from "merge/base" working directory that contained manually merged changes from "newmc" into a fresh checkout of the trunk. Base version of the trunk was r7349, which was tagged to produce $smlnj/sml/tags/pre-merge. The merged trunk version is r7352.

The changes include:

  • A completely rewritten match compiler that transforms absyn to absyn. The old match compiler files in FLINT/trans have been deleted. The new match compiler is documented in dev-notes/match-compiler/match-compiler.txt (and successor documents). The match compiler code is in the new directory base/compiler/Elaborator/matchcomp. FLINT/trans/translate.sml and related files were also extensively revised to work with the new match compiler.

  • PLAMBDA/FLINT types modules in FLINT/kernel have been revised so that the signatures of the main modules (Lty, LtyKernel, LtyDef, LtyBasic, LtyExtern) are all disjoint. Function and record metadata structures that were in the FLINT structure (FLINT/flint/flint.sml) have been moved to kernel/funrecmeta.{sig,sml}. This involved minor edits to virtually all files in FLINT and a few in CPS. See dev-notes/FLINT for further documentation.

  • Bug #294 (Compiler bug: Recover) has been fixed by deleting a few lines of code in FLINT/opt/fcontract.sml. Files opt/collect.sml and opt/fcontract.sml have been "clarified" to make them somewhat easier to maintain/debug.

  • The VarCon:VARCON structure defined in ElabData/syntax/varcon.{sig,sml} has been renamed Variable: VARIABLE (ElabData/syntax/variable.{sig,sml}), since it no longer contains functionality relating to data constructors.

  • The regression tests ($smlnj/tests) were run and a few true regressions were fixed, including a type error (bugs/tests/bug0573.sml). Many regression failures remain that are due to differences in the pretty printers, reworded error messages, and new printing conventions for overload scheme type variables. Some reference test outputs were updated, but at some point all new outputs should be generated. But first, signature printing should be redesigned and reimplemented.


Changed the binfile representation to remove the pickling of the FLINT intermediate representation. The purpose of this mechanism was to support cross-module inlining, but it has not been enabled for many years. Removing it is the first step toward migrating the pickling infrastructure to use ASDL.


Changed the way that versions are identified to allow a version suffix (e.g., "-rc1" for "release candidate 1"). I removed the bump-release mechanism, which might have served a similar purpose, but was restricted to integer suffixes and has not been used in recent memory. Also changed the SMLNJVersion structure, which is generated by the versiontool. The new interface supports the version suffix and includes both a build date and a release date in the version information record.

Version 110.99.2; 2021/09/23


Fix a benign bug where the size of a floating-point spill record was twice as large as necessary on 64-bit systems.


Split out the Real.toLargeInt implementation into target-word-size versions (the Real64ToIntInf module). For 64-bit targets, the new version uses the bit representation of the real number to compute the result. The 32-bit version is the old code that uses floating-point operations. This change fixes bug #279 (Real.toLargeInt returns zero for anything in range [-512,512]).


In the translation from Absyn to PLambda, there was a function (inlops) that was used to build the primop and type data structures for numeric types. This function was being called for every primitive operator, even though its results only depended on the numeric type. I added a hash table to cache the results, indexed by numeric kind. This change speeds up the compiler by about 3% (e.g., compiling the compiler went from 58s to 55s on a MacBookPro with a 2.4GHz Intel i9 processor).

Version 110.99.1; 2021/04/12


Changed the AMD64 frame layout to include a word to hold the Overflow exception. This value is used by the LLVM backend to generate the exception for checked arithmetic operations.


Added support for running SML/NJ on M1 Macs via the Rosetta2 emulator. The change is to identify the arm processor as amd64 in the config/_arch-n-opsys script. Note that while the system basically seems to work okay under Rosetta, trying to run the makeml command after having compiled the compiler caused a crash.


Some minor restructuring of the logic in the generic installer.


Fixed a bug with how FLINT numeric types were being translated to CPS types. Specifically, types that were smaller than the default integer size (e.g., word8) should have been marked as having a tagged representation.


Fix for bug #280 (110.99 config/ -64 fails on macOS 10.15.7). I was unable to reproduce this problem, but after some investigation, it appears that the problem was inconsistent build tools being picked up from the user’s path. To protect against this issue, I made the paths to the ar and ranlib tools absolute.


Fixed a serious performance bug in the implementation of the CharBuffer and MonoBuffer structures. Essentially, if one did not reserve sufficient space for the contents, it could take quadratic time to fill the buffer. We now grow the buffer by a factor of 1.5 of its current size, with an upper bound on the extra growth of 256K.
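The growth policy can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the names maxExtra and grow are hypothetical, "256K" is read as 256 * 1024 elements, and the cap is applied to the added amount.

```sml
(* Compute a new capacity for a buffer that currently holds `cap` elements
 * and must hold at least `needed` elements.  All names are hypothetical.
 *)
val maxExtra = 256 * 1024      (* assumed meaning of "256K" *)

fun grow {cap : int, needed : int} : int = let
      (* growing by half the current size gives the factor of 1.5;
       * the extra amount is capped at maxExtra
       *)
      val extra = Int.min (cap div 2, maxExtra)
      in
        Int.max (cap + extra, needed)
      end
```

For example, grow {cap = 1000, needed = 1001} yields 1500, while a very large buffer grows by at most maxExtra unless the request itself needs more.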

Version 110.99; 2020/12/24


Changed the layout of the SML stack frame on the AMD64 architecture to make it compatible with the way that LLVM spills registers. Essentially, this just involved swapping the order of the swap area and the ML stuff. We took this opportunity, however, to localize the representation of this information in the compiler.


Changed the format of the "magic string" in the header of binfiles. The new format is "arch-version", where the architecture name is limited to at most seven bytes and the version is limited to at most eight bytes. The string is padded with spaces to a total length of 16 bytes.
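A minimal sketch of the format described above. The helper mkMagic is hypothetical (it is not the compiler's function), and the architecture and version strings used with it are only examples.

```sml
(* Build a 16-byte magic string of the form "arch-version", padded on the
 * right with spaces.  `mkMagic` is a hypothetical helper.
 *)
fun mkMagic (arch : string, version : string) : string =
      if size arch > 7 orelse size version > 8
        then raise Fail "mkMagic: component too long"
        else StringCvt.padRight #" " 16 (arch ^ "-" ^ version)
```

The limits fit together: seven bytes of architecture name, one hyphen, and eight bytes of version make exactly 16 bytes, so padding is only needed when a component is shorter than its limit.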


Various pretty-printer bug fixes:

  • bug #274 (Minor pretty printing glitch when printing structure specs)

  • bug #276 (Missing option to control extra newlines in REPL)

  • bug #277 (Excess white space when pretty printing a module signature)


Fixed bug #254 (Real.fromLargeInt produces negative results). The problem was because the digit size for is only one bit smaller than the default int and the scaling factor rbase was being computed using the InlineT.Real64.from_int function (so rbase ends up being negative). Thus it would return incorrect results whenever the IntInf representation involved more than one digit. This is a bug on both 32-bit and 64-bit systems. The fix was to switch to using InlineT.Real64.from_int{32,64} to convert rbase and digits to real values.


Fixed bug #267 (Returns an incorrect result for a calculation on for 32-bit mode). The problem was that on 32-bit machines, 64-bit division is implemented by the IntInf module with the result then being converted to 64-bits. The conversion used did not test for overflow in the result.


Changed the semantics of the spans returned by ml-ulex so that the second component of a span is the position of the rightmost character in the token (instead of the character following the token). Specifically, the span \((p_1, p_2)\) specifies the \(p_2 - p_1 + 1\) characters that start with the character at position \(p_1\) and run to \(p_2\) (inclusive). This change avoids a potential problem when the span of a token ends at the last character in a file (when the input is spread across multiple files).
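The new convention can be sketched as follows, using plain integers for positions as a simplifying assumption. The type and function names are hypothetical.

```sml
(* A span is a pair of positions; plain ints are used here as an assumption. *)
type span = int * int

(* number of characters covered by an inclusive span (p1, p2) *)
fun spanLength ((p1, p2) : span) : int = p2 - p1 + 1

(* Convert an old-style span, whose second component was the position
 * following the token, to the new inclusive form.
 *)
fun fromHalfOpen ((p1, p2) : span) : span = (p1, p2 - 1)
```

Under the new convention a three-character token starting at position 10 has the span (10, 12), where the old convention would have given (10, 13).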


Simplified the binfile representation by removing the option of having multiple code objects. Many years ago, we would split the code for a compilation unit into multiple independent functions so that the garbage collector could reclaim code that was only executed once (or was not referenced). The actual splitting of the code in the CpsSplitFun functor (CPS/clos/cps-split.sml) was replaced by a dummy implementation at some point, so we have not been generating multiple code objects for some time. Therefore, we have simplified the code generator to assume only one code object and have changed the binfile import/export code to only support a single code object per binfile.

Also made this change to the bootstrap loader (kernel/boot.c).

Version 110.98.1; 2020/08/25


Reverted some of the pretty printing changes that were made in 110.98 to the 110.97 version (the renaming of PrettyPrintNew to PrettyPrint and the directory reorganizations were unchanged). These changes fix bugs #266 (Pretty printing regression in SML/NJ 110.98), #268 (Polymorphic Type Pretty Printing Regression), and #271 (pretty printer regression for structure binding).


Fixed bug #269 (Word64.fromString causes an Overflow for greater than 2^32-1). This bug was the result of constants from the 32-bit version of the code not getting updated for the 64-bit version. Scanning of both hexadecimal and octal representations of both integers and words was affected.


Added an additional lowering pass for the STREQL primop. This primop is generated to implement pattern matching against string literals. Previously it was unrolled in the MLRISC code generator, but we now do the unrolling in CPS. The reason for this change is that implementing the unrolling in the LLVM code generator would be complicated because of the need to introduce phi nodes in one of the branches.

The unrolling in CPS is somewhat different from before in that we now bake the literal string being tested into the equality tests.

Also changed the representation of the primop to include the string being tested against and removed the STRNEQ primop.

Version 110.98; 2020/07/16


Changed the config/ script so that the default size is 64 bits for any machine that reports its machine as "x86_64."


Fixed bug #260 (Perform divide on crashes with FPE on Linux). The fix required adding SIGFPE as a second source of Overflow exceptions on Linux/amd64 machines.


Fixes for structure and signature pretty printing problems that were introduced in the extensive pretty printer/pretty printing modifications around revision 6291.

Files affected include ElabData/modules/ppmod.sml, ElabData/types/pptype.sml, TopLevel/print/ppdec.sml, and Basics/print/pputil.{sig,sml}.

The pretty printing for modules still seems quite fragile, so there are likely to be more pretty printing problems to be fixed later. In particular, pretty printing of functor and functor signature declarations hasn't been tested.


Added a new lowering pass following CPS optimization, but before closure conversion. This pass includes the previous passes for 64-bit operations on 32-bit machines and for conversions involving It also adds lowering for div and mod to native machine division (i.e., quot and rem) and for trapping conversions.

The purpose of this change is to simplify code generation in preparation for migrating to a LLVM-based backend.
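The relationship between div/mod (which round toward negative infinity) and quot/rem (which round toward zero) can be sketched at the source level. This is an illustration of the arithmetic only, not the compiler's CPS lowering; the names floorDiv and floorMod are hypothetical.

```sml
(* Express div and mod in terms of quot and rem.  An adjustment is needed
 * exactly when the remainder is nonzero and its sign differs from the
 * divisor's sign.
 *)
fun floorDiv (a : int, b : int) : int = let
      val q = Int.quot (a, b)
      val r = Int.rem (a, b)
      in
        if r <> 0 andalso ((r < 0) <> (b < 0)) then q - 1 else q
      end

fun floorMod (a : int, b : int) : int = let
      val r = Int.rem (a, b)
      in
        if r <> 0 andalso ((r < 0) <> (b < 0)) then r + b else r
      end
```

For instance, floorDiv (~7, 2) is ~4 while Int.quot (~7, 2) is ~3, matching the difference between the two rounding modes.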


Fix for bug #261 (Weird "calc_strictness" message being printed). Rewrote the function ElabUtil.calc_strictness and moved it to TypesUtil.calcStrictness.


The HTMLDev structure in the pretty-printing library has been moved into its own library ( and renamed as HTML3Dev. This change removes the dependency of on, which allowed us to remove all mention of from the compiler CM files.


Completed the removal of the trigonometry operators from the primops. This change also allowed the removal of extension support from the MLRISC code generator for the x86.


Disabled the use of the hardware instructions for the basic trig functions on the x86. Doing so simplifies cross compilation from non x86 hosts and also paves the way to removing the operations from the compiler’s primitive operators.


Improved the CPS contraction phase by adding strength-reduction optimizations to ContractPrim. These include recognizing when multiplications and divisions by powers of 2 can be replaced by shifts. Previously, these sorts of optimizations were provided by MLRISC, but we plan to simplify the CPS IR prior to code generation by replacing div and mod operations with native machine arithmetic, which would prevent MLRISC from making the optimizations.
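The idea behind this strength reduction can be sketched on unsigned words. This is a source-level illustration, not the ContractPrim code; the names isPow2, log2, and mulByConst are hypothetical.

```sml
(* true if w is a power of two, i.e., has exactly one bit set *)
fun isPow2 (w : word) : bool =
      w <> 0w0 andalso Word.andb (w, w - 0w1) = 0w0

(* base-2 logarithm of a power of two *)
fun log2 (w : word) : word = let
      fun loop (w, k) = if w = 0w1 then k else loop (Word.>> (w, 0w1), k + 0w1)
      in
        loop (w, 0w0)
      end

(* multiply by a known constant, using a shift when the constant allows it *)
fun mulByConst (x : word, c : word) : word =
      if isPow2 c then Word.<< (x, log2 c) else x * c
```

Unsigned division by a power of two can be handled the same way with a logical right shift; signed operations need more care because of rounding and overflow checks.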


The MLRISC instruction selector for the x86 and amd64 targets erroneously assumed that the idiv instruction sets the OF (overflow) condition code when dividing the largest negative number by ~1. In fact, such a division operation traps, which is okay, because the runtime system maps the trap to the Overflow exception. Since the check for overflow is unnecessary, it has been removed from the files MLRISC/amd64/mltree/amd64-gen.sml and MLRISC/x86/mltree/x86.sml.

Note that MLRISC's non-trapping signed division operations can actually trap on overflow, but this was true before this change.
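At the source level, the case in question is the division of the most negative integer by ~1, which the Basis Library specifies must raise Overflow for fixed-precision integers. A minimal sketch (the function name is hypothetical):

```sml
(* Returns true if dividing the most negative int by ~1 raises Overflow.
 * On implementations where the default int has arbitrary precision
 * (Int.minInt = NONE) there is no such value, so we return false.
 *)
fun minIntDivOverflows () : bool = (case Int.minInt
       of SOME m => ((ignore (m div ~1); false) handle Overflow => true)
        | NONE => false
      (* end case *))
```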


Changed the semantics of the --debug command-line option for ml-antlr. Previously this option replaced the actions with a print expression, but that limited its usefulness because of type errors in the generated code. The new behavior is to preserve the existing actions and just add the printing code.


Added a pass to the elaborator that checks for variables that are bound, but never referenced. This check can be controlled by the Control.Elab.unusedWarn flag. Unused top-level variables are not reported (unless they are bound in the local part of a local declaration).

The check is currently disabled because of false positives caused by a transformation in the type checker. For example, the following function declaration:

fun foo n = let
      fun f x = g x - 1
      and g x = f x + 1
      in
        f n
      end

gets represented by the following Absyn:

val foo = let
      val foo = (fn n => let
                val tmp = let
                      val rec f = (fn x => Int.- (g x,1))
                          and g = (fn x => Int.+ (f x,1))
                      in (f,g)
                      end
                val f = #1 tmp
                val g = #2 tmp
            in f n
            end)
      in foo
      end

where the instance of g bound to #2 tmp is unused. This transformation is done by the wrapRECdec function. The unused-variable implementation was influenced by Jacob Van Buren’s patch for Version 110.82.


The LambdaVar.lvar type is represented as an integer; since the earliest days of the compiler this representation has been concrete, which meant that the type system was not able to provide any guarantees that ints and lvars were not being mixed up. As of this change, LambdaVar.lvar is now an abstract equality type (internally, it is still an int). Some comments about the changes:

  • The LambdaVar structure now includes substructures that implement hash tables (LambdaVar.Tbl), finite maps (LambdaVar.Map), and finite sets (LambdaVar.Set).

  • most of the changes involved replacing IntHashTable, IntRedBlackSet, IntRedBlackMap with the equivalent substructures that were added to the LambdaVar structure (e.g., IntHashTable =⇒ LambdaVar.Tbl).

  • there were a few places where debugging code assumed that lvars could be printed as integers.

  • the pickling code requires a mechanism to convert between integers and lvars; this is the one place where the abstraction is broken.

  • the worst abuse of the fact that the lvar type was int was in the code generator, where arithmetic was used to generate a unique negative number that could be used as a hash key, so that a given lvar could be mapped to two different labels. I fixed this by using two tables.
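The effect of making the type abstract can be sketched with opaque signature ascription. This is a simplified model, not the actual LambdaVar interface: the names LVAR, LVar, fresh, toId, and fromId are hypothetical.

```sml
(* A simplified model of an abstract variable type.  The signature and
 * structure names and their members are hypothetical.
 *)
signature LVAR =
  sig
    eqtype lvar                       (* abstract, but supports equality *)
    val fresh : unit -> lvar
    val toString : lvar -> string
    (* conversions intended only for pickling-style code *)
    val toId : lvar -> int
    val fromId : int -> lvar
  end

structure LVar :> LVAR =
  struct
    type lvar = int                   (* the representation stays hidden *)
    val counter = ref 0
    fun fresh () = (counter := !counter + 1; !counter)
    fun toString v = "v" ^ Int.toString v
    fun toId v = v
    fun fromId n = n
  end

val x = LVar.fresh ()
val y = LVar.fresh ()
val distinct = (x <> y)               (* equality is still available *)
(* an expression such as  x + 1  is now rejected by the type checker *)
```

Keeping explicit conversion functions confines the one place where the abstraction has to be broken, which mirrors the pickling case mentioned in the list above.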


Fixed bug #256 ( incorrect). The original source of the bug was the Basis Library sample code, which has also been fixed.

Version 110.97; 2020/04/21


Changed the printing of tyvars; e.g., an OVLDV tyvar introduced by an occurrence of the overloaded operator "*" that also acquires the equality attribute will be printed as ''Z[OL(*)].


Eliminated AbsDec and ABSdec constructors

  • Eliminated AbsDec constructor in Parse/ast/ast.{sig,sml} and ABSdec constructor in ElabData/syntax/absyn.{sig,sml}. The "abstraction" declaration that these constructors implemented was in SML/NJ 0.93, but was eliminated in favor of opaque "sealing" signature ascription after SML '97.

  • Eliminated all other occurrences of these constructors throughout the compiler (Front End, FLINT, and cm).


File structure reorganization: the top-level compiler/Semant directory was eliminated. Remaining relevant subdirectories were Semant/pickle, which moved to ElabData, and Semant/prim, whose two files, prim-env.sml and primop-bindings.sml moved to the existing ElabData/prim directory. CM files compiler/ and ElabData/ changed accordingly.


Fixed bug #220

  • Major redesign of the overloading resolution mechanisms. Changed syntax of overload declaration (partly deferred to 110.98). Changed OVLDvar in VarCon.var, added files overloadclasses.sml and overloadvar.sml to Elaborator/types. Changed Types.tvKind in ElabData/types/types.{sig,sml}, splitting OVLD tvKind into OVLDV (overloaded variables/operators), OVLDI (overloaded int literals) and OVLDW (overloaded word literals). Modified treatment of the overload metavariables in Unify (Elaborator/types/unify.sml).

  • Changed printing of overload type metavariables (unification type variables (Types.tyvar)).

  • Files: ElabData/types/types.{sig,sml} ElabData/types/overloadclasses.sml (new) ElabData/types/overloadvar.sml (new) ElabData/types/overload.sml Elaborator/elaborate/elabcore.sml


Fixed bug #214

  • Changed printed message when a VALvar binding is shadowed to print “<hidden>” (function ppVar in MiscUtil/print/ppdec.sml).

  • Minor cleanup of dontPickle function in Semant/pickle/pickmod.sml.


Fixed bug #209

  • Added function checkForbiddenCons to ElabUtil (Elaborator/elaborate/elabutil.{sig,sml}) that checks if a symbol is in the forbidden constructor set (it, true, false, nil, ::, and ref).

  • Modified elabEXCEPTIONdec to check for forbidden exception constructor names. Rewrote function elabEb to simplify.

  • Added a test for forbidden constructor names in function elabConstr within elabDB.

  • Changed specs for types list and bool to datatype replication specs in system/Basis/Implementation/{list,bool}.sig to avoid an error caused by the occurrence of "forbidden" constructors.


Some clean up in the ml-lex/lexgen.sml code. Replaced the one-off implementation of finite maps with the RedBlackMapFn functor from the SML/NJ Library. Also got rid of the uses of polymorphic equality by changing token equality tests to pattern matching.


Turned several functors (ElabModFn, ElabTopFn, SigMatchFn, etc.) into structures and removed the redundant functor application files (and the directories) in Semant/elaborate and Semant/modules.


Fixed bugs #195 and #196

  • #195: added missing DOdec case in function getDeclOrder in ElabMod (Elaborator/elaborate/elabmod.sml l. ~563)

  • #196: modified elabDOdec in Elaborator/elaborate/elabcore.sml to return the empty Static environment (SE.empty)


Various minor changes related to the heap2exec and heap2asm programs.

  • Modified the config/ script to remove bin/heap2exec when the required helper bin/heap2asm is not installed.

  • Added -static and -dynamic as options to heap2exec (these are the same as --linkwith-a and --linkwith-so)

  • Rewrote heap2asm to be a bit more future-proof.


Addressed bug #247 (@SMLVersion should report 64/32 bit) by adding a new command-line option (@SMLwordsize) to the .run-sml command script. Specifying this option will cause the word size to be printed (either 32 or 64) and then the program will exit.


Fix for bug #252 (Boyer Benchmark Compile Failure). This crash was caused by a typo in the CPS/main/build-literals.sml code that caused an incorrect opcode to be generated for SAVE/LOAD instructions when the offset was >= 256.


Fixed the calculation of the maximum array/vector length for 64-bit targets. We had been using the calculation for 32-bit targets.


Fix for bug #245 (Lazy data types result in Compiler Bug error). The problem was that a number of symbols (e.g., deref) had been dropped from the _Core structure, but were required to support the lazy (and profiling) features in the compiler. The symbols have been reinstated and a comment has been added to explain why they are being included.


Fix for bug #244 (Compiler bug: PPObj: ppFields in ppval.sml). The code generator was using the wrong length tag for raw64 records on 64-bit machines (twice the length).


Clean up various issues in the configuration/build machinery for asdl. This includes a fix for bug #240 (Non-default 64-bit installation build failure)


Fix bug #239 (Date.toTime is incorrect (by a factor of 10E9)). Thanks to Johannes 5 Joemann for both the report and fix.

Version 110.96; 2019/12/13


Bug fix for a problem where ^C (and other signals) might be ignored. The fix is to use word-sized fields in the VProc state vector so that the word-sized move operations in the assembly code do not clobber adjacent fields.


Fix for bug #234 (Converting NaN to a string causes an infinite loop on 64-bit machines). The problem was in MLRISC/amd64/mltree/amd64-gen.sml, which was not generating comparisons that work correctly when the arguments are unordered.


Removed assertion checking from the amd64 runtime makefiles. It has not turned up any errors since 110.94 was released, so we will assume that things are working the way that they should.


Bugfix for bug #237 (heap2exec script fails on 110.95). The fix was provided by Kirill Boltaev.


Changed the default installation size to 64 bits on macOS 10.14 Mojave and later.


Fixed some code rot in the eXene sources (bug #233). With the LargeWord module changing from Word32 to Word64, there were a few places where things broke.


Added support for 64-bit executables on FreeBSD. As part of this effort, we fixed a couple of regressions (makefile issues) for the 32-bit version on FreeBSD that were introduced when the X86.prim.asm file was rewritten. We also switched from BSD signal handling to POSIX signal handling, since that is what we use for most other systems.


Fixed config/ script, which was not passing the size option to the .link-sml, which caused confusion for the "-64" flag. This problem was later reported as bugs #235 and #236.


Many years ago, SML/NJ had a bytecode interpreter, but it was mostly removed from the system a long time ago. There were, however, some remnants of it in the runtime system. These have now been removed.

Having made this change, the distinction between the "target" and "host" architectures is no longer necessary. Therefore, these have been merged into a single architecture property. The effects of this merge are as follows:

  • the TARGET_xxx and HOST_xxx C-preprocessor symbols have been replaced with a single ARCH_xxx symbol in the runtime system.

  • The SMLofNJ.SysInfo structure now provides getArchName and getArchSize functions.

  • The following SMLofNJ.SysInfo functions are deprecated and will be removed in 110.97: getHostSize, getHostArch, and getTargetArch.

Version 110.95; 2019/11/09


Fix for bug #230 (New literals-lifting code does not handle pair of reals).


Simplified the runtime-system build rules for Cygwin.


Created the script config/, which implements the fetching and unbundling of source and bin files in preparation for a Windows installation.


Fix for bug #229 (Real.fromString errors). This bug was actually two unrelated issues. The problem that Real.fromString returns Real.posInf for 0.0e123213213123213123123 has been fixed in the RealScan module (system/basis/Implementation/real-scan.sml). The second bug was a regression introduced in 110.93, where the SIGFPE signal was specified as the result of the into instruction, whereas Linux actually signals SIGSEGV for into. Note that there may be a related issue on BSD systems, where SIGBUS might be the signal, but we need access to a test machine to verify.


Fix for bug #230 (segmentation fault when compiling MLton sources with SML/NJ 64-bit). The problem was that when a large vector was being created, the assembly code did not correctly restore the stack state before trying to call the runtime system to do the allocation.


The runtime now uses MAP_ANON for allocating memory on 64-bit Linux. This change fixes a problem with versions of Linux that do not allow access to /dev/zero (such as on ChromeBooks).

Version 110.94; 2019/10/31


Modified the generic installer (base/system/smlnj/installer/generic-install.sml) to support conditional targets. The symbols that can currently be tested for are SIZE_32, SIZE_64, UNIX, and WINDOWS. See the config/targets file for more details.


Fixed bug #227 (CPS contraction is taking an excessive amount of time on word8 basis test).


Modified the CPS contraction phase to optimize the case where a numeric conversion is applied to a constant value.


Modified the Unix installer script (base/system/smlnj/installer/nix-install.sml) to pass a size argument to the configuration script. This argument is used by the ASDL configuration.


Overhauled the installation script (config/) and various script templates (e.g., config/_run-sml) to allow setting the default size. The config/ script now supports the following arguments:

-default size

specify the default size for the sml and other commands, where size is either 32 or 64.

-32

install the 32-bit version of the system.

-64

install the 64-bit version of the system.

It is possible to install both versions in the same location by running the script twice. For example, the commands

% config/ -32
% config/ -default 64

will install both versions with the 64-bit version as default. One would then use the command sml -32 to run the 32-bit version of the system. Note that the default version must be installed second.


Added support for the -64 flag to the fixpt script in base/system.


Added support for the -64 flag to the cmb-make script in base/system.


Renamed the REAL representation constructor to Raw64, which matches what is going on in the runtime system. Also renamed the toReal function to toReal64.


Updated the SMLofNJ.SysInfo structure by removing constructors from the oskind datatype that correspond to obsolete systems. Also added a getHostSize function that returns the host architecture’s native word size in bits (e.g., 32 or 64).


Added the -64 flag to the testml script in base/system and to the .run-sml script. Thus, one will be able to specify the 32-bit version of SML/NJ using the command sml -32 and the 64-bit version using the command sml -64. Currently, 32-bits is the default, since the 64-bit system is unstable.


Removed obsolete operating systems from the SMLofNJ.SysInfo.os_kind datatype. This change reduces the type to two constructors: UNIX and WIN32. Also added a function getHostSize to the SysInfo structure, which returns the host word size in bits (i.e., either 32 or 64). The word size is now reported in the compiler’s banner message at startup.


Fixed bug #130 (failure to raise Bind exception). Added a function refutable to ElabData/types/typesutil.{sig,sml} and used it to limit type generalization of val bindings in Elaborator/types/typecheck.sml. The fix does not deal properly with refutability of OR patterns, but the use of OR patterns in val bindings is a dubious feature.

This change also fixes bug #188 (Missing warning for nonexhaustive valbind patterns), bug #190 (Unexpected exception in SML/NJ with invalid list pattern match), and #199 (Compiler bug in pretty printing of result).
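
As an illustration of the behavior this fix is meant to guarantee, here is a minimal sketch (the names firstOf and result are ours, not from the compiler sources). A refutable pattern in a val binding should produce a nonexhaustive-binding warning at compile time and raise Bind at run time when the match fails:

```sml
(* Hypothetical example: the pattern x :: _ is refutable, so the
   compiler is expected to warn and the binding raises Bind on []. *)
fun firstOf (xs : int list) = let val x :: _ = xs in x end

(* Catch the Bind exception so the outcome can be inspected. *)
val result = SOME (firstOf []) handle Bind => NONE
val ok = firstOf [7, 8, 9]
```

Here result should be NONE, while firstOf applied to a non-empty list returns its head.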


Modified the cmb-make script to support passing compiler control flags to the build command. The flags should be specified after the path to the sml command (if it is given).


Finished the implementation of the new literal bytecode engine. There is a control flag (Control.CG.newLiterals) that allows switching between the old and new bytecodes.


Fix for bug #225 (Math.ln giving erroneous answers on Windows). The problem was an inconsistency in the way the Unix and Microsoft assemblers interpreted the addressing mode for the FLD instruction.


Clean up in the Basis Posix library code (both SML and runtime) to be consistent about when the SysWord.word type is being used to communicate information between SML code and the runtime system.

Version 110.93; 2019/09/05


Add support for specifying a 32 or 64-bit target as a command-line option to the .arch-n-opsys and .link-sml scripts. The default size is currently 32 bits, but that will change once 64-bit support is solid.


Generalize code generation for conversions involving tagged integers/words, where the size is not the default integer size. This situation previously occurred only for Word8.word values on 32-bit targets, but it also occurs for 32-bit values on 64-bit targets.


Rewrote the expansion of the INLLSHIFT, INLRSHIFTL, and INLRSHIFT primops (compiler/FLINT/trans/transprim.sml). The expansion process now correctly handles shift operations on types that are smaller than the default tagged-integer size. This change also allows the Word8 shift operations to be inlined.


Fixed a bug in the constant folding of arithmetic-right-shift operations. The sign was not getting extended for words when the most-significant-bit was set.
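
The intended semantics can be checked with a small sketch like the following (illustrative only; the names w, arith, and logical are ours). An arithmetic right shift replicates the most-significant bit, while a logical right shift fills with zeros:

```sml
(* The most-significant bit of 0wx80 is set. *)
val w : Word8.word = 0wx80

(* Arithmetic shift: the sign bit is copied into the vacated position. *)
val arith = Word8.~>> (w, 0w1)

(* Logical shift: a zero is shifted in instead. *)
val logical = Word8.>> (w, 0w1)
```

With correct sign extension, arith is 0wxC0 and logical is 0wx40.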


Fixed a bug in the Real.toManExp function (the exponent was off by one, which meant that the mantissa was two times its expected value). This fix also fixes a problem in Real.toLargeInt, where the function would go into an infinite loop in some cases.


This change probably also fixed bug #208 (Real.toManExp produces incorrect results in some cases).
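
A simple way to sanity check the Real.toManExp fix is a round trip through Real.fromManExp. This is only a sketch (the names x, roundTrip, and consistent are ours):

```sml
(* Decompose a value into mantissa and exponent, then rebuild it. *)
val x = 8.0
val {man, exp} = Real.toManExp x
val roundTrip = Real.fromManExp {man = man, exp = exp}

(* With a correct exponent, the rebuilt value equals the original. *)
val consistent = Real.== (roundTrip, x)
```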


Fixed bug #173 (OS.Process.sleep only works with whole numbers). For systems that have a finer-grained sleep function, such as the nanosleep(2) system call, the OS.Process.sleep and Posix.Process.sleep functions now support sub-second granularity.
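
For example, a quarter-second pause can be requested as sketched below (the names start and elapsed are ours; the measured time depends on the host system and is not guaranteed to be exact):

```sml
(* Record the time, sleep for 250 milliseconds, and measure the delay. *)
val start = Time.now ()
val () = OS.Process.sleep (Time.fromMilliseconds 250)
val elapsed = Time.- (Time.now (), start)

(* On a system with sub-second sleep support, the elapsed time should
   be close to the requested interval rather than a whole second. *)
val () = print (Time.toString elapsed ^ " seconds\n")
```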


Restructured the CPS contraction phase to make the fusion of integer/word conversions more uniform. Also fixed a bug where Int32.fromLarge(Word32.toLargeInt 0wxffffffff) would return ~1 instead of raising Overflow. The problem was that TEST(m,n) o COPY(n,p) was getting fused to COPY(m,p) when m = p, instead of TRUNC(m,p).
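
The expression from this entry can be wrapped in a handler to observe the expected behavior (a sketch; the name result is ours):

```sml
(* 0wxffffffff converts to 4294967295, which does not fit in Int32,
   so the conversion must raise Overflow rather than yield ~1. *)
val result =
      SOME (Int32.fromLarge (Word32.toLargeInt 0wxffffffff))
        handle Overflow => NONE
```

With the fix in place, result is NONE.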


Int64 comparisons were not always correct, which led to some positive values being printed as negative numbers (basically when the sign bit of the lower word was set).


Added Unsafe.IntInf structure, which provides access to the internal representation of the type. Note that this representation may change in the future.


Fixed bug #223 (Incremental Build fails on Windows). There was a missing CloseHandle() when getting a file’s timestamp.

Version 110.92; 2019/08/10


Changed base/system/allcross script to use cmb-cross script. Also modified the cmb-cross script to build compressed tar files, when given the -z option, and to clean up intermediate files.


Restructured the amd64 machine-code generation implementation and filled in many of the missing encodings. It should be complete for SML/NJ code generation, but needs more work to support the full set of operations described in the amd64.mdl file.


Some cleanup in the x86 MLRISC backend. Removed the MULB, MULW, and MULL unsigned-multiplication instructions, since they are not binary operations. The MULL instruction is covered by the MULL1 constructor in the multDivOp datatype. The same change was applied to the amd64 backend.


Many changes to the amd64 machine description:

  • Removed the PUSHB, PUSHW, and PUSHL instructions, since the matching POP operations are not supported.

  • Removed the CALLQ operation, since it is the same as CALL.

  • Removed the CLTD and CQTO operations, since those names are just synonyms for CDQ and CQO.

  • Replaced the INTO operation (which is not valid in 64-bit mode) with INT of byte.


New script for cross compiling to other architectures; the script is still called cmb-make, but now supports target-specific dependencies in the front-end (i.e., representation of numeric types and endianness). The cross compilation scheme was developed by Matthias Blume and then encoded in a script.


The runtime system now builds for the amd64 architecture. Most of the changes relate to the difference between the flat BIBOP on 32-bit platforms and the two-level BIBOP on 64-bit platforms.


Fix bug #224 (Word64.fromLargeInt fails). The problem was an incorrect record kind in CPS/opt/infcnv.sml (it was RK_RECORD instead of RK_RAWBLOCK).


Changed the rep datatype constructor Word32 to Raw (which covers both 32 and 64-bit numbers on 32-bit platforms). We now check the length of the raw object when converting to a concrete numeric type.


Removed the use of runtime-type passing for polymorphic arrays. The effect of this change is that code that uses the Array.array type will be faster when the element type is not real (e.g., sorting an array was 1.2 times faster), but slower when the type is real. Use the monomorphic type RealArray.array for best performance on arrays of reals.
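
The two array types offer the same style of interface, so switching is mostly a matter of changing the structure name. The following sketch (with names n, poly, mono, sumPoly, and sumMono chosen by us) builds the same data both ways:

```sml
val n = 1000

(* Polymorphic array of reals. *)
val poly : real Array.array = Array.tabulate (n, real)

(* Monomorphic array of reals, recommended above for performance. *)
val mono : RealArray.array = RealArray.tabulate (n, real)

(* Both arrays support the same fold operations. *)
val sumPoly = Array.foldl op+ 0.0 poly
val sumMono = RealArray.foldl op+ 0.0 mono
```

Both sums are 499500.0; only the representation of the elements differs.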

Version 110.91; 2019/06/20


We added a new primop, REAL_TO_BITS, that casts a floating-point value to the same-size word value. This primop allows the Assembly.logb function to be implemented in SML.

We have also refactored the implementation of the Math structure to share common code across the versions that are specialized for different levels of hardware support.


Rewrote the assembly code for the x86 and AMD64 targets. Previously, there were separate source files for Unix and Windows; these have been replaced by a single common file (one for each architecture). The assyntax.h file has also been replaced by x86-syntax.h, which covers both the x86 and AMD64 on both UNIX and Windows.

The AMD64.prim.asm file now compiles, although there are a few minor issues that will have to be fixed once we have a working code generator. We have also fixed a number of issues in the garbage collector related to the use of the 2-level BIBOP on 64-bit targets.


Some cleanup in the interval-timer code. In keeping with the other time-specific functions, I have switched the runtime-system API to use unsigned 64-bit nanoseconds to specify time values. I have also added an implementation for c-libs/smlnj-runtime/itick.c, which was missing. Lastly, moved the Windows-specific file win32-timers.c from runtime/kernel to runtime/mach-dep.


Added 64-bit implementations of the target-specific Basis Library modules in directory Basis/Implementation/Target64Bit.


Added PackWord64Big and PackWord64Little structures to Basis Library. Note that the implementation of these is target-specific.
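
A short sketch of how the two structures differ, assuming they match the Basis Library PACK_WORD signature (the names buf, firstByte, and swapped are ours):

```sml
(* An 8-byte buffer, initially all zeros. *)
val buf = Word8Array.array (8, 0w0)

(* Store a 64-bit value in big-endian byte order. *)
val () = PackWord64Big.update (buf, 0, 0wx0102030405060708)

(* The most-significant byte comes first in the buffer. *)
val firstByte = Word8Array.sub (buf, 0)

(* Reading the same bytes as little-endian reverses the byte order. *)
val swapped = PackWord64Little.subArr (buf, 0)
```

Here firstByte is 0wx01 and swapped is 0wx0807060504030201.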


Added bigEndian flag to the TARGET signature.

Version 110.90; 2019/06/12


Fixed the Concurrent ML library to use 64-bit positions (both the Unix and Windows versions).


Moved the year offset from SML to the runtime system. This change is necessary because Windows uses 1601 as year 0, whereas UNIX uses 1900. We have also switched to using unsigned 64-bit times in nanoseconds as the interface between the Basis code and runtime system. This change is consistent with the other places where time values are communicated between the runtime and SML code.


Fixed a problem with CM’s symbol filtering (see bug #222).

The problem could manifest itself when a library imported two symbols A and B from and then exported the same A but a different B (which could have been defined in terms of the imported B). Moreover, for the problem to occur both A and B within must have come from the same SML source file.

With the above setup, when running

CM.make "";

it was possible that instead of seeing the new A defined within one would still see the original version that came from


Various 64-bit porting changes to the Windows implementation of the Basis Library and runtime system:

  • Add a target-specific Handle structure to support the HANDLE type, which is a pointer-sized word value.

  • Changes to support the use of 64-bit file positions.

  • Replaced pairs of arguments representing time values (seconds and microseconds) with a single 64-bit count of microseconds.


Implemented Basis Library proposal 2019-001 (Correction to the PRIM_IO signature). This proposal changes the return type of the avail function in a reader to be option, which is necessary to support large files.


Added primop support (PTR_TO_WORD and WORD_TO_PTR) for the c_pointer type that was added in 110.89. These primops are exposed in the new InlineT.Pointer structure. We define a PointerImp structure that is used inside the Basis implementation and an Unsafe.Pointer structure that is visible to users.

Version 110.89; 2019/06/01


Switched the Position structure to be bound to Int64 and updated the runtime system to use 64-bit integers for file offsets and time values (in nanoseconds). This change fixes bugs #33 (Overflow exception with inputLine function) and #36 (Can’t open very large file).


Added abstract c_pointer type to the primitive types. This type will be used to represent runtime-system pointers (e.g., the HANDLE values in the Windows implementation).


Removed makefiles and code for architectures and operating systems that are no longer supported (e.g., the DEC Alpha and HPPA architectures).


Switched the FixedInt and LargeWord structure aliases to be 64-bits (i.e., FixedInt is now bound to Int64 and LargeWord is bound to Word64).
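
On version 110.89 or later, the new sizes of these aliases (and of Position, changed earlier in this release) can be inspected at the REPL. This is a sketch, with names of our choosing:

```sml
(* precision returns an int option; wordSize returns an int. *)
val positionBits = Position.precision
val fixedIntBits = FixedInt.precision
val largeWordBits = LargeWord.wordSize
```

Each of these should report 64 bits (SOME 64 for the two precision values).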


We are now assuming that we have at least C99 support (for practical purposes, this assumption is even true on Windows). With this assumption, the allocation of small objects in the runtime has been switched from macros to inline functions (see runtime/include/ml-objects.h). This change allows a graceful handling of 32-bit integers, which are heap allocated on 32-bit machines, but tagged on 64-bit machines.


Fixed various bugs in the implementation of the Word64 operations. The addition and subtraction operators were using arithmetic right shifts, instead of logical right shifts. Also, the translation of 64-bit shift operations was incorrect because of a typo in the variable names.
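
The following sketch exercises the operations mentioned above across the 32-bit boundary, which is where carries, borrows, and logical shifts matter (the names lo, carried, borrowed, and shifted are ours):

```sml
(* The largest value that fits in the low 32 bits. *)
val lo : Word64.word = 0wxFFFFFFFF

(* Adding 1 must carry into the upper 32 bits. *)
val carried = Word64.+ (lo, 0w1)

(* Subtracting 1 must borrow back from the upper 32 bits. *)
val borrowed = Word64.- (carried, 0w1)

(* A logical right shift of the top bit must not smear the sign. *)
val shifted = Word64.>> (0wx8000000000000000, 0w63)
```

The expected values are 0wx100000000, 0wxFFFFFFFF, and 0w1, respectively.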


Created a simplified version of the MLRiscGen functor. This version of the functor, which is in the file CodeGen/main/mlrisc-gen-fn.sml, does not include the memory disambiguation and GC types code. Since the old version (CodeGen/main/mlriscGen.sml) did not use these features by default, there should be no difference in the quality of the generated code.

The purpose of this change is to remove unused code that has 32-bit dependencies.


Added contraction for unsigned REM and NEG operations in CPS/opt/contract-prim.sml.

Version 110.88; 2019/05/15


Moved the compiler/DEVNOTES directory to the dev-notes tree and renamed it old-compiler-notes.


Added 64-bit versions of NumFormat and NumScan. We use the 32-bit version for numbers of 32-bits or less and the 64-bit versions for numbers with up to 64 bits. Thus, on 32-bit machines, the default int and word types use NumFormat32 and NumScan32, while on 64-bit machines they use NumFormat64 and NumScan64. This change also required splitting out some common code into a ScanUtil structure and also splitting out the scanning of real numbers into the ScanReal structure (formatting of reals was already in its own structure).


Reimplemented the 64-bit int and word types to put them on a (mostly) equal footing with the other precisions. In this new implementation, the basic types int64 and word64 are now PRIMITIVE (instead of being ABSTRACT types represented by pairs of boxed 32-bit words). Arithmetic and comparison operations on these types are represented as primops and are preserved as such up to just before closure conversion. At that point, the new Num64Cnv structure (compiler/CPS/opt/numcnv.sml) is used to expand 64-bit operations and constants into 32-bit operations. Most of the 64-bit primops are inline expanded, but multiplication and division operations are converted to calls to library code from the CoreInt64 and CoreWord64 modules (system/smlnj/init).

Because the types are primitive, we were able to change the runtime representation to use packed records (RK_RAWBLOCK) to represent them, which saves space and should also help with performance.

See the dev-notes/ file for more details about the implementation.


Reorganized the Basis Library source files (system/Basis) to isolate dependences on target word size.

In the Basis/Implementation directory, I created subdirectories (e.g., Target32Bit) to hold implementations that are specific to the target. These directories include a bind-structs.sml file that replaces the many bind-*.sml files in Basis/Implementation.

In the Basis/Exports directory, I replaced the many individual files (each with a single module renaming) with bind-common.sml (for target-independent bindings) and a target-specific file (either bind-target-32-bit.sml or bind-target-64-bit.sml).


Some of the CPS optimization modules (Expand and EtaSplit) were written as functors over the machine spec, when, in fact, they never reference their functor argument. Therefore, they have been converted to structures.


We now use the InlineT.identity primop for, so the compiler can optimize it.


Fixed pretty-printing regression in 110.87; values of char type were missing their enclosing quotes.

Version 110.87; 2019/05/03


Made the Char.chr operator inline (a primop was added to support this change in 110.86).


Major renaming of the primitive operators in the Inline structure (as described in dev-notes/). Also cleaned up the Basis Library implementation to remove most (but not all) 32-bit dependencies.


Added cases to the top-level pretty printer to handle the new basic types that were added in 110.86 (e.g., word8vector and chararray). Also changed the way that primitive types are handled to use a table keyed by tycons, instead of a sequence of nested conditionals.

Version 110.86; 2019/05/02


Added word8vector and chararray to the primitive types that the compiler knows about. These will be used in the rewriting of the InlineT structure.


Replaced the Primop.primop constructors NUMSUBSCRIPT and NUMUPDATE with

```sml
  | NUMSUBSCRIPT of numkind
  | NUMSUBSCRIPTV of numkind
  | NUMUPDATE of numkind
  | INLNUMSUBSCRIPT of numkind
  | INLNUMSUBSCRIPTV of numkind
  | INLNUMUPDATE of numkind
```

This design matches the naming conventions for polymorphic subscripting and updating.


Added Primop.INLCHR to implement Char.chr as an inline function. This change also required moving the definition of the Chr exception to the Core module so that it is accessible to the translate phase. The inline version of Char.chr will be enabled in the 110.87 release (we need the internal primop before we can use it).


Major overhaul of the representation of primitive operators (both in the Primop and CPS.P structures). The primitive arithmetic and comparison operations are now defined in the ArithOps structure (ElabData/prim/arithops.sml). There are three datatypes defined in this module:

  • arithop — integer arithmetic operations that may raise overflow

  • pureop — arithmetic operations that are pure

  • cmpop — comparison operations

These types are used in both the Primop and CPS.P modules, which makes the translation between representations more direct.

Some details:

  • inline division and modulo operations were added to the Primop.primop datatype; the expansion of these in the TransPrim module (FLINT/trans/transprim.sml) adds explicit checks for division by zero.

  • the FSGN operator was added to the Primop.primop datatype, since the new cmpop datatype does not include it (the CPS IR already had FSGN as a separate branch constructor).

  • unsigned comparison operations are now represented by using the UINT numkind, which is consistent with how they are represented in CPS.

  • Renamed the primop ROUND to REAL_TO_ROUND.

  • the encodings for operators were revised in the pickler, resulting in a more compact use of the numeric codes.


Removed unused record kind constructors (RK_SPILL, RK_EXN, and RK_BLOCK) from CPS.record_kind datatype. Also renamed RK_I32BLOCK to RK_RAWBLOCK and RK_FBLOCK to RK_RAW64BLOCK. Various other renamings to remove 32-bit assumptions.


Renamed DTAG_raw32 to DTAG_raw, since the semantics on 64-bit systems will be to require word-size aligned raw data. Also renamed ML_AllocRaw32 to ML_AllocRaw and ML_ShrinkRaw32 to ML_ShrinkRaw for similar reasons.


Removed unused flags from the Control structure; most of these came from Control.CG, where roughly 20 out of 60 flags were no longer used.


Split the contraction of primitive operators out of the Contract structure into its own ContractPrim structure.


Split the translation of primops to PLambda out into its own file (compiler/FLINT/trans/transprim.sml).


Fixed regression: Word32.toInt 0wx8002DE32; would return 187954 instead of raising Overflow. The problem was a mistake in the way that the overflow trap was being generated in MLRiscGen.


Some minor primop cleanup.

  • Changed the types of Primop.ROUND and Primop.REAL to take bitwidths, instead of numkinds, since the kinds are always the same. Also, the fields are now called from and to (instead of fromkind and tokind) to be consistent with other conversion primops.

  • Renamed ABS to FABS, since it is only used on floating-point numbers.

  • Renamed the CPS primitive operator ROUND to REAL_TO_INT and the operator REAL to INT_TO_REAL.

  • Renamed the Primop.REAL to Primop.INT_TO_REAL so that it is not confused with the other constructors named REAL.


Improvements to the core 64-bit int and word modules in system/smlnj/init. Replaced Int64.+, Int64.-, Word64.+, and Word64.- with versions from Hacker’s Delight that use fewer conditional branches. Also replaced the relational operators (<, <=, etc.) with more direct implementations.


Fix for bug #213 (Int32.div raises Div instead of Overflow when dividing minInt by ~1). Since the compiler generates an explicit test for division by zero, we know that the only arithmetic traps must be caused by other operations. Therefore, we can just map any arithmetic trap to Overflow.

Also removed the old SPARC assembly code for multiplication and division. The code generator always uses the native hardware instructions, so the assembly code is not needed.
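
The distinction between the two exceptions in the fix for bug #213 can be observed with a sketch like this one (the names minInt, overflowCase, and divCase are ours):

```sml
(* The most negative 32-bit integer. *)
val minInt = valOf Int32.minInt

(* minInt div ~1 is not representable, so Overflow is expected. *)
val overflowCase = SOME (Int32.div (minInt, ~1)) handle Overflow => NONE

(* Division by zero is caught by the explicit test and raises Div. *)
val divCase = SOME (Int32.div (minInt, 0)) handle Div => NONE
```

Both bindings evaluate to NONE, each through a different exception.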


Yet another attempt to get the implementation of use in the REPL working in a sensible way.

With these changes, use should behave as follows. If an invocation of use encounters a compilation error (either in the initial file or in a nested invocation of use), then the compiler error message will be printed and the call to use will immediately return (). If an invocation of use raises an exception during execution of the compiled code (either in the initial file or in a nested invocation of use), then the exception will be reported at the top-level. Any change to the global state or environment that occurs before an error is encountered will not be rolled back.

Files specified as command-line arguments to the sml command will be treated as if use was invoked on them. If there is an error, then the error will be reported and the sml command will terminate with a non-zero exit status (at least on Unix).

This change fixes bugs #193, #217, and #219. There is a connection between this change and #183, which was fixed in Version 110.82.


Change to the CPS primops: moved the F_SGN operator (which is unary) from the fcmp datatype to the branch datatype (and renamed it FSGN).


Finished conversion of the CPS IR to a form that is compatible with ASDL. Basically, this involved converting the datatype constructor names to upper-case identifiers.

These changes are a step in the plan to eventually switch to an LLVM-based code generator that will be given pickled CPS code as its input.


Starting to migrate the CPS IR toward the ASDL version. Changed the names of the CPS.P.arith and CPS.P.cmpop constructors to be upper-case alpha IDs (many of them were symbolic identifiers). Also split out the various utility functions into the new CPSUtil module (CPS/cps/cps-util.sml). Lastly, moved the literals.sml file from FLINT/main to CPS/main (where it belongs).

Note that the CPS.P.arithop datatype is now identical to the Primop.arithop datatype.


Reorganized the backend of the compiler by moving the CPS-related code into its own directory tree (Compiler/CPS) and replacing the FLINTComp functor with the FLINTOpt structure and the CPSCompFn functor. The conversion from FLINT to CPS is part of the CPSCompFn functor, which takes the program representation all the way from FLINT to machine code segments.

Version 110.85; 2018/12/21


Modified config/ to look for a pre-Mojave SDK when trying to install on macOS 10.14 Mojave.


Updated runtime/objs/cygwin.def so that the runtime system will build on 32-bit Cygwin. Also updated installation script to suggest using the 32-bit version of Cygwin when a user tries to install it on Cygwin64.


Xcode 10.1, which is Apple’s development environment for macOS 10.14 Mojave, does not include the libraries needed to build 32-bit executables, such as the SML/NJ runtime, although 32-bit programs will still run.

To support building on Mojave, I added a new Makefile (mk.x86-darwin18) for the runtime system and modified the config/ script to use this makefile when necessary. This new makefile expects that the MacOSX10.13.sdk directory from Xcode 9 has been copied into the Xcode 10 SDKs directory. Note that updating Xcode from the AppStore will likely remove the 10.13 SDK, so you should keep a copy in a safe place.

The Xcode SDKs live in Platforms/MacOSX.platform/Developer/SDKs under the Developer directory. One can determine the path to the current developer directory using the command

% xcode-select -p

Removed several unsupported primitive operators from the compiler. In the CPS IR, these were free, acclink, setpseudo, setmark, and getpseudo. The pseudo-register operations were not supported in the code generator, while the others were no-ops. The corresponding operators GETPSEUDO, SETPSEUDO, SETMARK, and DISPOSE were removed from ElabData/prim/primop.sml and their bindings were removed from Semant/prim/primop-bindings.sml and the InlineT and Unsafe structures.

The AllocProf module in the compiler was also disabled, since it relied on the pseudo registers for recording profile information at runtime. Furthermore, uses of the acclink primitive operation in FLINT/cps/closure.sml when static profiling is enabled were removed.

These changes were committed as revision 4886.


Fix for bug #216 (run-time system fatal error with large top-level value). The problem was in the code for building literals.


Change CPS operators for wrapping/unwrapping integer and float values to be word-size flexible. We now use a single wrap (and unwrap) operator that is parameterized by a numkind value. We also changed the wrap/unwrap operators to box/unbox. The mapping from old operators to new ones is as follows:

  • wrap(INT defaultIntSz)

  • unwrap(INT defaultIntSz)

  • wrap(INT 32)

  • unwrap(INT 32)

  • wrap(FLOAT 64)

  • unwrap(FLOAT 64)


Further cleanup for 64BIT in function atomeq in PEqual (base/compiler/FLINT/trans/pequal.sml). Added numKind, intEqTy, and uintEqTy functions. The numKind function should be extended once int64 and word64 are treated as primitive types in the compiler.


Fixed 64BIT issue in module MatchComp (base/compiler/FLINT/trans/matchcomp.sml). Added int64Ty and word64Ty cases to function numCon.


Fixed 64BIT issue in module Equal (base/compiler/FLINT/reps/equal.sml). Exports just one function: equal_branch, which is called once in reps/wrapping.sml to type-specialize branches on calls to POLYEQUAL.


The CPS optimizer had a mechanism for checking the CPS against the FLINT types, which required maintaining a mapping from lvars to their FLINT types. This code has long since bit-rotted and cannot even handle a simple expression like 1+2. Therefore, I’ve removed the mapping (a hash table) from the CPS optimizer and the vestigial code that modified it in the various CPS optimization passes.


Modified the InfCnv (now named IntInfCnv) structure to remove 32-bit dependencies.


Modified Pequal (in base/compiler/FLINT/trans/pequal.sml) and Translate (in base/compiler/FLINT/trans/translate.sml) to remove 32-bit dependencies, though further changes will be required to properly handle int64 and word64 types when defaultIntSz = 64.


Reimplemented the Switch module (in base/compiler/FLINT/cps). The new implementation follows the same basic design as before, but the code is better organized and documented, and it now uses the concrete CPS representations, instead of being parameterized over an abstraction of them. It also now uses binary search for boxed switches.

Version 110.84; 2018/09/03


Reimplemented the array/vector-slice modules to use a (base, start, length) representation (as does Substring in system/smlnj/init/substring.sml). Also fixed a bug in the slice findi functions, where the index being passed to the predicate function was not adjusted to be slice-relative.
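
The slice-relative indexing can be illustrated with a small sketch (the names arr, sl, and found are ours):

```sml
val arr = Array.fromList [10, 20, 30, 40]

(* A slice covering the last two elements (30 and 40). *)
val sl = ArraySlice.slice (arr, 2, NONE)

(* The index passed to the predicate, and returned with the result,
   is relative to the start of the slice, not the underlying array. *)
val found = ArraySlice.findi (fn (_, x) => x = 30) sl
```

The element 30 sits at index 2 of arr but at index 0 of the slice, so found is SOME (0, 30).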


Improved implementation of and CharVectorSlice.mapi to not build intermediate list of results.


A beta release of the ASDL library and asdlgen tool has been added to the system. This version of the tool implements SML support, but the C++ support is not complete. There is a CM tool for ASDL, which recognizes the .asdl file suffix.


Two changes to the installer (base/system/smlnj/installer):

  1. The build scripts for programs are now named (instead of build) on Unix systems.

  2. The config action has been added to support module configuration.


Added RENAME extension style to CM tool support. This extension style allows arbitrary file names to be generated from the base name.


Fixed a bug in the implementation of monomorphic buffers: the functions CharBuffer.add1 and Word8Buffer.add1 had an incorrect length test.


Fixed a compiler bug (arg ty lists wrong length) in unifyTy that could occur when one of the type constructors is the ERRORtyc. This bug occurs because the ERRORtyc is equal to any other type constructor, which (incorrectly) implies that the number of type arguments should be equal.

Version 110.83; 2018/06/01


Fixed bug #206 (Parsing of explicit type variables and val rec is broken). This bug was also bug number 1261 in the old bugs list.


Fixed minor bug in Date.toString (missing leading "0" for day of month). This issue was bug number 1444 in the old bugs list.
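
The Basis Library specifies a 24-character result of the form "Wed Mar 08 19:06:45 1995" for Date.toString. A sketch that checks the day-of-month field (the names d, s, and dayField are ours):

```sml
(* A date whose day of month is a single digit. *)
val d = Date.date {
        year = 1995, month = Date.Mar, day = 8,
        hour = 12, minute = 6, second = 45,
        offset = NONE
      }

val s = Date.toString d

(* Characters 8 and 9 hold the zero-padded day of the month. *)
val dayField = String.substring (s, 8, 2)
```

With the fix, dayField is "08" rather than a single digit.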


Cleaned up match compiler code (FLINT/trans/matchcomp.sml) and added typing and function comments. Added debugging and printing infrastructure, including new FLINT/trans/mcprint.sml file, and new Control.MC.debugging flag.


Fixed parser to allow parentheses around val rec patterns.


Fixed the scanner to produce the correct error message for bad escape sequences in string literals.


Fixed old bug number 1383: Char.toCString #"\000" returned "\\0", instead of "\\000", which caused String.toCString to produce invalid results.
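
A quick check of the corrected behavior (a sketch; the names nul and embedded are ours):

```sml
(* The NUL character must be written as a three-digit octal escape. *)
val nul = Char.toCString #"\000"

(* String.toCString applies the same translation to each character. *)
val embedded = String.toCString "a\000b"
```

Here nul is the four-character string consisting of a backslash followed by 000, and embedded is a, then that escape, then b.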


Fix for bug #201 (The library is missing).


Added MONO_BUFFER signature, with instances CharBuffer and Word8Buffer, to Basis implementation (Basis Library Proposal 2018-001).


Fix a bug where “0w” was being accepted as a prefix for a hexadecimal word value in Word.fromString/scan (ignoring case, only “0x” and “0wx” are valid prefixes). This change fixes bug number 1375 from the old bugs list.
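
The accepted forms can be checked as sketched below (the names plain, hexPrefix, and wordPrefix are ours); the sketch deliberately covers only the valid prefixes:

```sml
(* Word.fromString scans hexadecimal digits, with an optional prefix. *)
val plain = Word.fromString "1F"
val hexPrefix = Word.fromString "0x1F"
val wordPrefix = Word.fromString "0wx1F"
```

All three bindings evaluate to SOME 0wx1F.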


Fixed a bug in the parsing of bindings involving the op keyword. The parser was more restrictive than the definition. This change fixes bug number 1370 from the old bugs list.


The lexer gave an unmatched close comment error on "*)", when it should have scanned it as the tokens "*" ")". This change fixes bug number 330 in the old bugs list.

Note: there is some ambiguity as to what the correct behavior should be here. The Definition of Standard ML (1997) only says that unmatched open comments should be signalled as errors, but the Commentary on the Definition of Standard ML (1991) says otherwise in Appendix D. SML/NJ started signalling an error in version 0.71, but we have chosen to revert to accepting this sequence, to match the 1997 Definition (and the behavior of other systems).


The sameSign function returned incorrect results in the Int31 and Int32 modules.
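
The Basis Library defines sameSign (i, j) as true exactly when i and j have the same sign. A sketch of the expected results for Int32 (the binding names are ours):

```sml
(* Two negative values, and two positive values, have the same sign. *)
val bothNegative = Int32.sameSign (~5, ~7)
val bothPositive = Int32.sameSign (3, 9)

(* Values of opposite sign do not. *)
val mixed = Int32.sameSign (~5, 7)

(* Zero has its own sign, so it matches only zero. *)
val zeroes = Int32.sameSign (0, 0)
val zeroAndPositive = Int32.sameSign (0, 1)
```

The first, second, and fourth bindings are true; mixed and zeroAndPositive are false.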


Fixed various minor parsing and scanning issues:

  • correct syntax for type variables

  • signature/structure/functor IDs should always be alpha IDs

  • the equality ID (=) cannot appear in a binding context. Note that we still allow the syntax val op = = …​ because it is needed to parse the file system/smlnj/init/built-in.sml.


Completed overhaul of the way that int/word literals are handled in the compiler. We now use to represent the values in all IRs. This change also results in better CPS contraction, since we now perform constant folding for both signed and unsigned values at all sizes. We were also able to get rid of the tricky code that worries about large tagged integer values that might cause overflow during code generation.


Improved the reporting of errors involving literal values. We now use the original source text when describing the value in the error message.


Fix for bug #191 (Compiler crash when handling large reals). We now issue a warning for real literals that will round to zero and an error for real literals that are too large to represent. Some work still needs to be done to support sub-normal literal values (these are currently rounded to zero).


Changed the representation of real literals from strings to RealLit.t.


Removed real patterns from Absyn and FLINT, since they are not allowed by SML'97 and were not present in the AST representation.


Fix for bug #194 (Real.fromString overflows or hangs). There were two issues here. First, the Overflow exception was being raised when scanning large exponents, but it was not being handled by the scanning code. The second issue was that the scaling loop for large exponents did not immediately terminate once infinity (or zero) was reached, so it could take a long time.
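The second issue can be illustrated with a small scaling loop that stops as soon as further multiplication cannot change the result. This is only a sketch of the idea, not the code used in the Basis implementation; scaleBy10 is a hypothetical helper.

```sml
(* Multiply x by 10 up to n times, but stop early once the value is no
   longer finite or has become zero, since further scaling cannot change it. *)
fun scaleBy10 (x : real, n : int) : real =
      if n <= 0 orelse not (Real.isFinite x) orelse Real.== (x, 0.0)
        then x
        else scaleBy10 (x * 10.0, n - 1)

(* Terminates after a few hundred iterations instead of a billion. *)
val huge = scaleBy10 (1.0, 1000000000)
val small = scaleBy10 (2.0, 2)
```

Without the early exit, the loop count would be driven by the exponent in the input string, which is what made very large exponents appear to hang.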


Moved the Version-1 literal building code into gc/old-literals.c. This file can be removed once the compiler generates the Version-2 literal bytecode.


Moved the check for whether an int or word literal is in range for its type from the absyn→plambda translation to the overload resolver (compiler/Elaborator/types/overload.sml).


Part 1 of an overhaul of the way that the compiler treats int/word literals. The end goal is to use to represent literals throughout all phases of the compiler. In this step, we changed the representation of literals in the Absyn representation (earlier representations already used

Version 110.82; 2017/10/16


Fixed unnumbered bug in IntInf.mod and IntInf.rem functions, where the Div exception was not getting raised when both arguments are 0.
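A minimal check of the fixed behavior might look like the following; raisesDiv, modOk, and remOk are names introduced here for illustration.

```sml
(* Returns true if applying f to (0, 0) raises Div. *)
fun raisesDiv (f : IntInf.int * IntInf.int -> IntInf.int) : bool =
      (ignore (f (0, 0)); false) handle Div => true

val modOk = raisesDiv IntInf.mod
val remOk = raisesDiv IntInf.rem
```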


Various bits of cleanup in the handling of primitive operations, such as removing the ptnum mechanism for translating from Absyn to FLINT.


Added Target module, which specifies the properties of the target (e.g., the size in bits of the default int type). Reworked the generation of the InlineT structure to be target specific.


Removed FLINT primops (and their CPS counterparts) that are not in the InlineT structure and, thus, are never used by the compiler.


Fixed bug #123 (missing nonexhaustive bind warning). The mkVBs function in FLINT/trans/translate.sml was adding a redundant default rule by calling ElabUtil.completeMatch after a default rule had already been explicitly added to the match for let bindings.
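The kind of declaration affected is a refutable pattern in a let binding. The sketch below is our own example, not the test case from the bug report; firstOf is a hypothetical function.

```sml
(* The pattern x :: _ does not cover the empty list, so the compiler is
   expected to warn that this binding is not exhaustive. *)
fun firstOf (xs : int list) : int = let
      val x :: _ = xs
      in
        x
      end

val one = firstOf [1, 2, 3]

(* Applying firstOf to the empty list raises Bind at run time. *)
val raisedBind = (ignore (firstOf []); false) handle Bind => true
```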


Fixed bug #183 (status code returned by sml REPL). This fix restores the version 110.79 behavior of having sml foo.sml exit with a non-zero status when there is a type-checking error in foo.sml. It also cleans up the error messages associated with use when there is a syntax error.


Fixed bug #185 (Bring command line help text into parity with man page). Added missing options (@SMLversion and @SMLsuffix) to the help message that is printed for the command “sml -h”. Also adjusted the order of options in the help message, and in the man page, so that the orders match.


Changed the way that we test for allocation-space addresses in minor GCs. Instead of using the BIBOP, we now do a pointer range test. On 32-bit systems, this change results in a small (~0.13%) performance boost, but we expect a bigger impact on 64-bit hardware, where the cost of BIBOP probes will be higher and there are more registers available to hold the nursery bounds.


Fixed some issues in build-literals.c. These were mostly false positives in the assertions, but there was also a bug in the way that the available space was tracked that could conceivably result in a crash (but was very unlikely).


Updated _arch-n-opsys script to recognize macOS 10.13 (High Sierra) as a valid target.


Fixed a bug in the way that JSON string values were being printed. The code previously assumed that C-style escaping will work, but that is not true for "\'" (as well as for control and non-ASCII characters). The new implementation assumes that the string value is UTF-8 and uses the "\\u" escape sequences for characters outside the JSON escapes and printable ASCII characters.
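To make the distinction concrete, here is a small sketch of JSON escaping restricted to 7-bit input. It is not the SML/NJ Library code: escapeChar and escapeAscii are hypothetical names, and bytes above 127 are passed through unchanged because the UTF-8 decoding step is omitted.

```sml
(* Escape one character for use inside a JSON string literal. *)
fun escapeChar (c : char) : string = (case c
       of #"\"" => "\\\""
        | #"\\" => "\\\\"
        | #"\b" => "\\b"
        | #"\f" => "\\f"
        | #"\n" => "\\n"
        | #"\r" => "\\r"
        | #"\t" => "\\t"
        | _ => if Char.ord c < 32 orelse Char.ord c = 127
              (* other control characters use a four-digit \u escape *)
              then "\\u" ^ StringCvt.padLeft #"0" 4
                     (Int.fmt StringCvt.HEX (Char.ord c))
              (* everything else, including the single quote, is unchanged *)
              else String.str c)

fun escapeAscii (s : string) : string = String.translate escapeChar s
```

Note that the single quote is emitted as-is, since JSON defines no escape for it.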

Version 110.81; 2017/05/01


Fixed bug #129 (Symbolic identifiers are allowed as strids).


Fixed bug #179 (ml-ulex writing debug messages to stdOut). Both ml-ulex and ml-antlr now direct their debug and status messages to stdErr (instead of stdOut).


Linux distributions are starting to require that the stack be marked as non-executable in applications. Because the runtime system includes assembly code, this marking was not happening. We’ve added .section directives to the PPC.prim.asm and X86.prim.asm files to mark the stack as non-executable. Thanks to Daniel Moerner for reporting this issue and for providing a pointer to the fix.


Added --debug command-line option to ml-antlr. This flag causes ml-antlr to generate debug actions that print the left-hand-side non-terminal of the production.


Working on 64-bit support. Changes include making code generation dependent on the target word size and abstracting over the BIBOP representation in the runtime system.


Further cleanup of the separation of FLINT from the front-end. Eliminated all references to ModulePropLists (module-plists.sml) in the front end and in pickling, and moved module-plists from Semant/modules to FLINT/trans. ModulePropLists is now only used in FLINT/trans/translate.sml.

Revision: 4314

Files changed:

  • compiler/ElabData/modules/modules.sml (cleaned up)

  • compiler/Elaborator/print/ppmod.sml (cleaned up)

  • compiler/FLINT/trans/module-plists.sml (moved from Semant/modules)

  • compiler/Semant/modules/instantiate-param.sml (deleted)

  • compiler/Semant/pickle/pickmod.sml (no longer mentions property lists)

  • compiler/Semant/pickle/unpickmod.sml (ditto)

  • compiler/Semant/statenv/prim.sml

  • compiler/Semant/types/tp-var-info.sml (deleted)

  • compiler/ (modified for move of module-plists.sml)


Eliminated the front end's dependency on PlambdaType by adding a type TKind.tkind, which is a simplified stand-in for PlambdaType.tkind for use during elaboration. TKind.tkind values are translated on demand to PlambdaType.tkind in trans/transtypes.sml. Types still has a tycpath type, but it is now defined using TKind.tkind. The new structure SigPropList replaces ModulePropLists (Semant/modules/module-plists.sml) for use in instantiate.sml. Instantiate is now defined directly as a structure, so the functor application in Semant/modules/instantiate.sml no longer exists.

Files changed:

  • ElabData/basics/debindex.sig (moved here from Elaborator/basics)

  • ElabData/basics/debindex.sml (ditto)

  • ElabData/basics/sig-plist.sml (new)

  • ElabData/basics/tkind.sml (new)

  • ElabData/types/types.sig

  • ElabData/types/types.sml

  • Elaborator/modules/instantiate.sml

  • Elaborator/print/ppmod.sml

  • FLINT/trans/transtkind.sml (new)

  • FLINT/trans/transtypes.sml

  • TopLevel/interact/evalloop.sml

  • ElabData/

  • Elaborator/



Added support for Successor ML record-expression-punning syntax. For example, one can now define a function f as

fun f x = {x}

which is equivalent to the definition

fun f x = {x = x}

Fixed a bug in the parser. Asterisk (*) was not allowed as a record label when using the record-pattern-punning syntax.


Added support for do exp Successor ML syntax.


Fixed bug #153 (Enabling Successor ML features is delayed). We now use a function Control.setSuccML to switch to/from Successor ML mode in the REPL. The function resets the parser, so the next input will be correctly parsed. The Control.succML flag is no longer visible in the REPL.


Fixed bug #149 (Datatype replication exposes hidden constructors). Added boolean field stripped to DATATYPE variant of tyckind in compiler/ElabData/types/types.sml with default value false. stripped is set to true when a datatype is matched with a simple type spec in signature matching, and datatypes with stripped set to true are disallowed in datatype replications.

Files changed:

  • compiler/ElabData/types/types.sig

  • compiler/ElabData/types/types.sml

  • compiler/ElabData/types/typesutil.sml

  • compiler/ElabData/types/core-basictypes.sml

  • compiler/Elaborator/types/basictypes.sml

  • compiler/Elaborator/types/eqtypes.sml

  • compiler/Elaborator/modules/evalent.sml

  • compiler/Elaborator/modules/sigmatch.sml

  • compiler/Elaborator/modules/instantiate.sml

  • compiler/Elaborator/print/ppabsyn.sml

  • compiler/Elaborator/print/pptype.sml

  • compiler/Elaborator/elaborate/elabcore.sml

  • compiler/Elaborator/elaborate/elabmod.sml

  • compiler/Elaborator/elaborate/elabtype.sml

  • compiler/Elaborator/elaborate/elabsig.sml

  • compiler/Semant/pickle/pickmod.sml

  • compiler/Semant/pickle/unpickmod.sml

  • compiler/MiscUtil/print/ppobj.sml

  • compiler/FLINT/trans/transtypes.sml

  • compiler/FLINT/trans/pequal.sml


Added %tokentype directive to ml-antlr; this directive allows users to specify the token datatype externally, which is necessary in order to share a lexer with two different ml-antlr parsers.


Changed the interface to AMD64Gen in MLRISC; the signBit and negateSignBit callbacks now return an MLTree.rexp (instead of a label).

Version 110.80; 2016/08/19


Fixed bug #151 (Error installing from source on Mac OS X). The fix involves changes to both the config/ script and the mk.x86-darwin makefile. With this fix, we pass the SDK argument to /usr/bin/as only when the OS version is 10.10 (Yosemite) or later.


Added the proposed unzipMap, unzipMapi, find, and findi functions to the ListPair module.


Added the proposed mapLeft, mapRight, appLeft, and appRight functions to the Either module.
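As a rough illustration of the one-sided maps, the sketch below assumes that mapLeft and mapRight each take a function and apply it only to the matching constructor; the value names are ours.

```sml
val parsed : (string, int) Either.either = Either.INL "oops"

(* mapLeft transforms only an INL payload *)
val len = Either.mapLeft String.size parsed

(* mapRight leaves an INL value untouched *)
val same = Either.mapRight (fn n => n + 1) parsed
```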


Fixed bug #145 (Internal exception occurs on bogus annotation instead of typechecking diagnostic). Added missing OVLD_UB case in function failMessage in compiler/Elaborator/types/unify.sml.


Fixed bug #166 (Can’t install SML/NJ in directories containing spaces). Thanks to Eugene Sharygin for the patch.


Fixed incorrect sign extension of the dividend before a 32-bit divide in the amd64 code generator in MLRISC.


Fixed bug #150 (Add title to batch script).


Implemented the changes for Basis Library Proposal 2016-001. This proposal added the popCount function to the WORD signature.
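For example, popCount counts the bits that are set in a word value. A short sketch using the default Word structure (the value names are ours):

```sml
val eight = Word.popCount 0wxFF   (* eight low bits set *)
val none = Word.popCount 0w0      (* no bits set *)
val two = Word.popCount 0w5       (* binary 101 *)
```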


Fixed bug #156 (sml resumes after SIGSTOP with bogus exception report). The fix is a bit of a hack: I modified the non_bt_hdl function in evalloop.sml to match an IO.Io exception with the appropriate shape for this situation.


Fixed bug #154 (Return code for ml-ulex when there is an error).


Fixed bug #155 (Misleading printing of word literals in error messages).


Fixed a bug in the implementation of the --ml-lex-mode flag for ml-ulex. The \h escape sequence is supposed to map to the character range [\128-\255], but did not.


Fixed bug #147 (Hexadecimal escapes in strings are not supported). We previously did not support Unicode escapes in string literals. We now do so, with non-ASCII codepoints being mapped to the UTF-8 encoding with escape values in the range 0..255 being mapped to the corresponding 8-bit character. Values outside that range are flagged as an error.
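A minimal example of the escape syntax, restricted to ASCII code points so that it does not depend on how larger values are encoded (the value names are ours):

```sml
(* The \u escape takes exactly four hexadecimal digits. *)
val a = "\u0041"
val newline = "\u000A"
val isA = (a = "A")
```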

Revised August 4, 2016


Partial fix for the noisy exception-stack traces on the Error exception. The cases that are handled by this change are applying use to a non-existent file and when there are compilation errors in a program being built by CM.make. What remains to be handled is the situation where CM.make is applied to a non-existent file.

Version 110.79; 2015/10/04


Patched base/compiler/FLINT/clos/closure.sml so that Twelf will build again. Fixes bug #140 (Lookup failure in closure.sml when compiling Twelf).


Added support for a Successor ML tool to CM. This tool allows one to specify that a source file foo.sml is Successor ML source code in the following ways:

foo.sml : succ-ml
foo.sml : sml (succ-ml)
foo.sml (succ-ml)

Added the directory base/old-basis to support backward-compatible views of the Basis Library. You can use these by replacing the line




in your CM files.


New implementation of Date structure in the Basis, which fixes bugs #138 (Incorrect behavior for Date.fromTimeLocal) and #139 ( is broken). Note that some more thought should be given to the correct semantics of when dealing with offsets. For example, should an offset of +23 hours produce the same date as an offset of -1 hours? Currently our implementation produces different results (by a day) for these two situations.


Implemented the changes for Basis Library Proposal 2015-003. This proposal added operations to the following signatures:

signature ARRAY
signature LIST
signature LIST_PAIR
signature MONO_ARRAY
signature MONO_VECTOR
signature OPTION
signature STRING
signature TEXT
signature VECTOR

and the following structures:

structure Array : ARRAY
structure CharArray : MONO_ARRAY
structure CharVector : MONO_VECTOR
structure List : LIST
structure ListPair : LIST_PAIR
structure Option : OPTION
structure Real64Array : MONO_ARRAY
structure Real64Vector : MONO_VECTOR
structure String : STRING
structure Text : TEXT
structure Vector : VECTOR
structure Word8Array : MONO_ARRAY
structure Word8Vector : MONO_VECTOR

While it is very unlikely that these changes will break existing code, there are a couple of scenarios in which the code might break. Namely, when use of open introduces conflicts and when user code implements one of the affected Basis Library signatures. Both of these examples occurred in the SML/NJ source code; the former in the ml-yacc sources and the latter in the MLRISC sources.


Added the optional implementations of PackReal64Big and PackReal64Little. This addition addresses feature request #82 (Implementations of PACK_REAL missing). The implementation uses the approach suggested by Michael Sullivan.
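A short round-trip sketch using the PACK_REAL operations toBytes and fromBytes; it assumes the structure is visible in the default top-level environment, and the value names are ours.

```sml
(* Pack a real into its 8-byte big-endian IEEE representation and back. *)
val bytes = PackReal64Big.toBytes 3.5
val count = Word8Vector.length bytes
val back = PackReal64Big.fromBytes bytes
```

PackReal64Little offers the same operations with the byte order reversed.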


Fixed bug #45 (Compiler bug in specialize phase). This bug was in compiler/FLINT/opt/fcontract.sml and was the result of a bad interaction between eta contraction and inlining. As part of the fix, I cleaned up the code in this part of FLINT a bit.


Improvements to the error messages produced by the ml-ulex lexer generator.


Added Ref structure and REF signature to Basis implementation (Basis Library Proposal 2015-007).
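The following sketch exercises a few of the operations described in the proposal (swap, modify, and exchange); the reference cells a and b are our own example values.

```sml
val a = ref 1
val b = ref 2

(* swap exchanges the contents of two cells: a holds 2, b holds 1 *)
val () = Ref.swap (a, b)

(* modify updates a cell in place: a holds 20 *)
val () = Ref.modify (fn x => x * 10) a

(* exchange stores a new value and returns the previous one *)
val old = Ref.exchange (b, 7)
```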


Added Fn structure and FN signature to Basis implementation (Basis Library Proposal 2015-005).
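A few of the combinators from the proposal in use; this is an illustrative sketch, and the value names are ours.

```sml
(* curry turns a function on pairs into a curried function *)
val inc = Fn.curry (op +) 1
val fortyTwo = inc 41

(* flip swaps the components of the argument pair: computes 10 - 1 *)
val nine = Fn.flip (op -) (1, 10)

(* repeat n f composes f with itself n times: 1 doubled three times *)
val eight = Fn.repeat 3 (fn x => x * 2) 1

(* const ignores its second argument *)
val zero = Fn.const 0 "ignored"
```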


Fixed bug #136 (Incorrect raising of exceptions in Real.fmt and Time.fmt).


Added Either structure and EITHER signature to Basis implementation (Basis Library Proposal 2015-002).
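A typical use of the either type is to carry a result that is either an error or a value. The sketch below uses the INL and INR constructors and the partition function from the proposal; the data is made up for the example.

```sml
val results : (string, int) Either.either list =
      [Either.INR 1, Either.INL "bad input", Either.INR 3]

(* partition separates the INL payloads from the INR payloads *)
val (errors, values) = Either.partition results

val firstIsValue = Either.isRight (hd results)
```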


Fixed bug #135 (Fails to build on Linux PowerPC).


Added Linux 4.* kernels to the list of operating systems recognized by the .arch-n-opsys script (fixes bug #134).


Added Mac OS X 10.11 (El Capitan) to the list of operating systems recognized by the .arch-n-opsys script.


Added support for Successor ML lexical extensions. These can be enabled using the command-line option -Cparser.succ-ml=true or by the assignment

Control.succML := true;

at the REPL. The extensions are as follows:

  • Underscore (“_”) as a separator in numeric literals; e.g., 123_456, 0wxff_ff_ff_f3, 123_456.1, …​

  • end-of-line comments, which are denoted using (*). End-of-line comments properly nest into conventional block comments. For example, the following block comment is well formed:

    fun f x = x (*) my identity function *)

  • binary literals for both integers and words; e.g., 0b0101_1110, or 0wb1101.

This change is the beginning of a program to add Successor ML features to SML/NJ.