TBBMalloc

Another good alternative for a memory manager is Intel's TBBMalloc, part of their Threading Building Blocks library. It is released under the Apache 2.0 license, which allows usage in open source and commercial applications. You can also buy a commercial license if the Apache 2.0 license doesn't suit you. Like ScaleMM, this memory manager only supports the Windows platform.

Intel's library was designed for C and C++ users. You can download it from https://www.threadingbuildingblocks.org, but you will still need a Delphi unit that connects Delphi's memory management interface with the functions exported from Intel's tbbmalloc DLL.

Alternatively, you can download the Intel TBBMalloc Interfaces project from https://sites.google.com/site/aminer68/intel-tbbmalloc-interfaces-for-delphi-and-delphi-xe-versions-and-freepascal. The package you'll get there contains both a compiled tbbmalloc.dll and the cmem unit, which links this DLL into your application.

To run the application, you have to make sure that it can find tbbmalloc.dll. The simplest way to do that is to put the DLL in the same folder as the exe. For 32-bit applications, use the DLL from the tbbmalloc32 subfolder; for 64-bit applications, use the DLL from the tbbmalloc64 subfolder.
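If the DLL is missing or has the wrong bitness, an application that links to it statically will not start at all, which can be hard to diagnose. A small console program along the following lines may help to check the setup. This is only a sketch and not part of the Intel TBBMalloc Interfaces package: the program name CheckTbbMalloc is hypothetical, and it assumes that the DLL exports scalable_malloc under that name, as the cmem import declarations suggest. Compile it for the same platform (Win32 or Win64) as your application and run it from the folder that contains the exe.

```pascal
program CheckTbbMalloc; // hypothetical helper, not part of the package

{$APPTYPE CONSOLE}

uses
  Winapi.Windows,
  System.SysUtils;

var
  Lib: HMODULE;
  Err: DWORD;
begin
  // Uses the standard Windows DLL search order, starting in the exe folder
  Lib := LoadLibrary('tbbmalloc.dll');
  if Lib = 0 then
  begin
    Err := GetLastError;
    Writeln('tbbmalloc.dll could not be loaded: ', SysErrorMessage(Err));
    // A 32-bit DLL in a 64-bit process (or vice versa) fails with this code
    if Err = ERROR_BAD_EXE_FORMAT then
      Writeln('The DLL bitness probably does not match this executable.');
    ExitCode := 1;
    Exit;
  end;
  try
    // Assumption: the allocator is exported as 'scalable_malloc'
    if GetProcAddress(Lib, 'scalable_malloc') = nil then
      Writeln('DLL loaded, but scalable_malloc is not exported.')
    else
      Writeln('tbbmalloc.dll found and scalable_malloc is available.');
  finally
    FreeLibrary(Lib);
  end;
end.
```

The check loads the DLL dynamically, so it can report a problem instead of failing at startup the way a statically linked application would.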

Intel TBBMalloc Interfaces actually implements three different interface units, all named cmem. The version in subfolder cmem_delphi can be used with Delphi versions up to 2010, subfolder cmem_xe contains a version designed for Delphi XE and newer, and there's also a version for FreePascal in subfolder cmem_fps.

Creating an interface unit to a DLL is actually pretty simple. The cmem unit first imports the functions from tbbmalloc.dll:

function scalable_getmem(Size: NativeUInt): Pointer; cdecl;
  external 'tbbmalloc' name 'scalable_malloc';

procedure scalable_freemem(P: Pointer); cdecl;
  external 'tbbmalloc' name 'scalable_free';

function scalable_realloc(P: Pointer; Size: NativeUInt): Pointer; cdecl;
  external 'tbbmalloc' name 'scalable_realloc';

After that, writing memory management functions is a breeze. You just have to redirect calls to the DLL functions:

function CGetMem(Size: NativeInt): Pointer;
begin
  Result := scalable_getmem(Size);
end;

function CFreeMem(P: Pointer): Integer;
begin
  scalable_freemem(P);
  Result := 0; // 0 reports success to the Delphi runtime
end;

function CReAllocMem(P: Pointer; Size: NativeInt): Pointer;
begin
  Result := scalable_realloc(P, Size);
end;

The AllocMem function is implemented by calling the scalable_getmem DLL function and filling the memory with zeroes afterwards:

function CAllocMem(Size: NativeInt): Pointer;
begin
  Result := scalable_getmem(Size);
  if Assigned(Result) then
    FillChar(Result^, Size, 0);
end;
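These four functions do nothing until they are registered with the Delphi runtime. The following sketch shows one way this could be done with the standard SetMemoryManager call and the TMemoryManagerEx record from the System unit, as they exist in Delphi XE and newer. It is an illustration only and not the actual source of cmem: the unit name cmem_sketch, the TbbMemoryManager constant, and the two leak-registration stubs are assumptions made for this example.

```pascal
unit cmem_sketch; // hypothetical name, not the unit shipped with the package

interface

implementation

function scalable_getmem(Size: NativeUInt): Pointer; cdecl;
  external 'tbbmalloc' name 'scalable_malloc';

procedure scalable_freemem(P: Pointer); cdecl;
  external 'tbbmalloc' name 'scalable_free';

function scalable_realloc(P: Pointer; Size: NativeUInt): Pointer; cdecl;
  external 'tbbmalloc' name 'scalable_realloc';

function CGetMem(Size: NativeInt): Pointer;
begin
  Result := scalable_getmem(Size);
end;

function CFreeMem(P: Pointer): Integer;
begin
  scalable_freemem(P);
  Result := 0;
end;

function CReAllocMem(P: Pointer; Size: NativeInt): Pointer;
begin
  Result := scalable_realloc(P, Size);
end;

function CAllocMem(Size: NativeInt): Pointer;
begin
  Result := scalable_getmem(Size);
  if Assigned(Result) then
    FillChar(Result^, Size, 0);
end;

// Assumption: this sketch does not track expected leaks, so both stubs
// simply report that nothing was registered or unregistered.
function CRegisterLeak(P: Pointer): Boolean;
begin
  Result := False;
end;

function CUnregisterLeak(P: Pointer): Boolean;
begin
  Result := False;
end;

const
  TbbMemoryManager: TMemoryManagerEx = (
    GetMem: CGetMem;
    FreeMem: CFreeMem;
    ReallocMem: CReAllocMem;
    AllocMem: CAllocMem;
    RegisterExpectedMemoryLeak: CRegisterLeak;
    UnregisterExpectedMemoryLeak: CUnregisterLeak
  );

initialization
  // Replace the default memory manager for the whole process
  SetMemoryManager(TbbMemoryManager);

end.
```

The sketch deliberately does not restore the previous memory manager on shutdown, because blocks allocated by one manager must not be released by another. For the same reason, a unit like this has to be the first one in the program's uses list, so that it is installed before any other unit allocates memory.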

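The ParallelAllocation demo is all set for testing with TBBMalloc. You only have to copy tbbmalloc.dll from the tbbmalloc32 folder to Win32\Debug and change ParallelAllocation.dpr so that it loads cmem instead of FastMM or ScaleMM. As with any replacement memory manager, cmem has to be the first unit in the uses list so that it is installed before any other unit allocates memory: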

program ParallelAllocation;

uses
  cmem in 'tbbmalloc\cmem_xe\cmem.pas',
  Vcl.Forms,
  ParallelAllocationMain in 'ParallelAllocationMain.pas' {frmParallelAllocation};

What about the speed? Unbeatable! Running the ParallelAllocation demo with TBBMalloc shows parallel code being up to 8.6x faster than the serial code: