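Java Interview in 50 Lectures, Part 10: Direct (Off-Heap) Memory, Principles and Usage

## 1. Understanding the Off-Heap Memory Source Code

HeapByteBuffer is the on-heap ByteBuffer. It stores its data in a byte[] and is a fairly simple wrapper around that array. DirectByteBuffer is the off-heap ByteBuffer: it stores its data directly in off-heap memory and is one of the core designs behind NIO's performance. This article walks through the implementation of DirectByteBuffer.

## Using DirectByteBuffer

To create a DirectByteBuffer, call java.nio.ByteBuffer#allocateDirect: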

public static ByteBuffer allocateDirect(int capacity) {
    return new DirectByteBuffer(capacity);
}
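As a quick illustration of the factory method above, the following sketch allocates one direct and one heap buffer and compares them. The class name AllocateDirectDemo and the helper newDirect are made up for this example; only ByteBuffer.allocateDirect, allocate, isDirect and hasArray come from the JDK.

```java
import java.nio.ByteBuffer;

class AllocateDirectDemo {
    // Hypothetical helper: allocates a direct (off-heap) buffer of the given capacity.
    static ByteBuffer newDirect(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    public static void main(String[] args) {
        ByteBuffer direct = newDirect(1024);
        ByteBuffer heap = ByteBuffer.allocate(1024);
        System.out.println(direct.isDirect()); // true
        System.out.println(heap.isDirect());   // false
        // A direct buffer has no accessible backing byte[].
        System.out.println(direct.hasArray()); // false
    }
}
```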

## How a DirectByteBuffer Is Instantiated

Let's see how a DirectByteBuffer is constructed and how it requests and releases memory. First, the constructor:

DirectByteBuffer(int cap) {                   // package-private
    // Initialize the four core Buffer attributes: mark, position, limit, capacity
    super(-1, 0, cap, cap);
    // Check whether page alignment is required; controlled by -XX:+PageAlignDirectMemory, false by default
    boolean pa = VM.isDirectMemoryPageAligned();
    int ps = Bits.pageSize();
    // Compute the real size to request and make sure enough memory is available
    long size = Math.max(1L, (long)cap + (pa ? ps : 0));
    Bits.reserveMemory(size, cap);

    long base = 0;
    try {
        // Allocate the memory through Unsafe
        base = unsafe.allocateMemory(size);
    } catch (OutOfMemoryError x) {
        // Allocation failed: roll back the reservation
        Bits.unreserveMemory(size, cap);
        throw x;
    }
    // Zero the allocated memory
    unsafe.setMemory(base, size, (byte) 0);
    // Set the start address, rounding up to a page boundary if required
    if (pa && (base % ps != 0)) {
        address = base + ps - (base & (ps - 1));
    } else {
        address = base;
    }
    // Register the memory-release handler through the Cleaner mechanism
    cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
    att = null;
}

Before requesting memory, the constructor calls java.nio.Bits#reserveMemory to check whether enough space is available:


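// This method checks whether the requested off-heap memory would exceed the user-specified maximum.
// If enough space remains, it updates the accounting variables.
// If no space remains, it throws an OOME.
// Parameters:
//     size: the real amount of memory to request, after accounting for page alignment
//     cap: the amount of memory the user asked for (<= size)
static void reserveMemory(long size, int cap) {
    // Several static counters are updated together, so the Bits class lock is needed
    synchronized (Bits.class) {
        // Fetch the maximum off-heap memory that may be requested; the field starts out as 64 MB
        // and can be set with -XX:MaxDirectMemorySize=<size>
        if (!memoryLimitSet && VM.isBooted()) {
            maxMemory = VM.maxDirectMemory();
            memoryLimitSet = true;
        }
        // -XX:MaxDirectMemorySize limits the capacity requested by users, ignoring alignment,
        // so two variables are tracked:
        //     reservedMemory: the space actually reserved so far
        //     totalCapacity: the space requested by users so far
        if (cap <= maxMemory - totalCapacity) {
            reservedMemory += size;
            totalCapacity += cap;
            count++;
            return; // enough space: update the counters and return
        }
    }

    // Not enough space: try a GC
    System.gc();
    try {
        Thread.sleep(100);
    } catch (InterruptedException x) {
        // Restore interrupt status
        Thread.currentThread().interrupt();
    }
    synchronized (Bits.class) {
        // Check again after the GC; if there is still not enough space, throw an OOME
        if (totalCapacity + cap > maxMemory)
            throw new OutOfMemoryError("Direct buffer memory");
        reservedMemory += size;
        totalCapacity += cap;
        count++;
    }
}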

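In java.nio.Bits#reserveMemory, if space is insufficient, System.gc() is called to try to free memory and the check is repeated. If there is still not enough space, an OOME is thrown.

If the allocation itself fails, the reservation has to be rolled back in the counters: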

static synchronized void unreserveMemory(long size, int cap) {
    if (reservedMemory > 0) {
        reservedMemory -= size;
        totalCapacity -= cap;
        count--;
        assert (reservedMemory > -1);
    }
}
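The counters that Bits maintains can be observed from application code. The sketch below reads the "direct" buffer pool through the standard BufferPoolMXBean API. The class name DirectPoolStats and the helper directPool are made up for this example, and the assumption that getCount, getTotalCapacity and getMemoryUsed are backed by count, totalCapacity and reservedMemory is based on the JDK 8 sources and may differ in other versions.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectPoolStats {
    // Hypothetical helper: returns the MXBean of the "direct" buffer pool,
    // or null if this JVM does not expose one.
    static BufferPoolMXBean directPool() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Keep a reference so the buffer stays reachable while we read the stats.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
        BufferPoolMXBean pool = directPool();
        if (pool != null) {
            System.out.println("count         = " + pool.getCount());
            System.out.println("totalCapacity = " + pool.getTotalCapacity());
            System.out.println("memoryUsed    = " + pool.getMemoryUsed());
        }
        System.out.println("buffer capacity = " + buffer.capacity());
    }
}
```

The absolute numbers depend on what else the JVM has allocated, so the example prints them without assuming particular values.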

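From the functions above we can conclude the following:

The -XX:+PageAlignDirectMemory flag controls whether off-heap allocations are page-aligned; they are not aligned by default.
Every allocation calls Bits.reserveMemory, and every release or failed allocation calls Bits.unreserveMemory. reserveMemory uses internally maintained counters to decide whether enough space remains: if so, it updates the counters; if not, it calls System.gc() to try to reclaim space and checks again, throwing an OOME if space is still insufficient. unreserveMemory simply rolls the counters back.
The check in Bits.reserveMemory is not about whether the physical machine has enough memory; it is about whether the off-heap budget fixed at JVM startup still has room. That budget is set with -XX:MaxDirectMemorySize=<size>.
Once enough space is confirmed, the memory is requested with sun.misc.Unsafe#allocateMemory.
The newly allocated memory is zeroed.
DirectByteBuffer relies on the Cleaner mechanism to reclaim the space.

Apart from the space check, the core logic is the call to sun.misc.Unsafe#allocateMemory. Let's see how this function allocates off-heap memory: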

// Allocates a block of native memory. The memory is uninitialized and its contents are unpredictable.
// Use freeMemory to release it and reallocateMemory to resize it.
public native long allocateMemory(long bytes);



// openjdk8/hotspot/src/share/vm/prims/unsafe.cpp
UNSAFE_ENTRY(jlong, Unsafe_AllocateMemory(JNIEnv *env, jobject unsafe, jlong size))
  UnsafeWrapper("Unsafe_AllocateMemory");
  size_t sz = (size_t)size;
  if (sz != (julong)size || size < 0) {
    THROW_0(vmSymbols::java_lang_IllegalArgumentException());
  }
  if (sz == 0) {
    return 0;
  }
  sz = round_to(sz, HeapWordSize);
  // Call os::malloc, which uses the malloc function internally
  void* x = os::malloc(sz, mtInternal);
  if (x == NULL) {
    THROW_0(vmSymbols::java_lang_OutOfMemoryError());
  }
  //Copy::fill_to_words((HeapWord*)x, sz / HeapWordSize);
  return addr_to_java(x);
UNSAFE_END

So sun.misc.Unsafe#allocateMemory obtains its memory through malloc, the C standard library function.

## How a DirectByteBuffer Is Reclaimed

At the end of the DirectByteBuffer constructor we saw this statement:

// Register the memory-release handler through the Cleaner mechanism
cleaner = Cleaner.create(this, new Deallocator(base, size, cap));

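This is where the Cleaner mechanism is used for memory reclamation. Because the memory a DirectByteBuffer requests lives off-heap, the DirectByteBuffer object itself merely stores the start address of that memory. The total footprint of a DirectByteBuffer therefore consists of the on-heap object plus the corresponding off-heap memory. The on-heap part is only a small fraction, which is why such objects are called iceberg objects.

The on-heap DirectByteBuffer object is handled by the garbage collector as usual, but the off-heap memory is not reclaimed by the GC. A mechanism is therefore needed that frees the off-heap memory when the DirectByteBuffer itself is collected.

Java offers the finalize method as one option, but finalization is officially discouraged. The recommended approach is to use phantom references to handle the follow-up work when an object is collected; see the article "JDK源码阅读-Reference" (JDK Source Reading: Reference). Java also provides the Cleaner class to simplify this: Cleaner is a subclass of PhantomReference, and its Runnable callback runs once the garbage collector hands the reference over for processing, the point at which an ordinary PhantomReference would be added to its ReferenceQueue.

DirectByteBuffer uses the Cleaner mechanism to free its off-heap memory when the buffer object itself is garbage collected. Here is how the release handler is implemented: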


private static class Deallocator
    implements Runnable
{

    private static Unsafe unsafe = Unsafe.getUnsafe();

    private long address;
    private long size;
    private int capacity;

    private Deallocator(long address, long size, int capacity) {
        assert (address != 0);
        this.address = address;
        this.size = size;
        this.capacity = capacity;
    }

    public void run() {
        if (address == 0) {
            // Paranoia
            return;
        }
        // Free the memory through Unsafe
        unsafe.freeMemory(address);
        address = 0;
        // Update the accounting variables
        Bits.unreserveMemory(size, capacity);
    }

}
sun.misc.Unsafe#freeMemory releases the memory with the C standard library function free. The handler then updates the accounting variables in the Bits class.
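The same pattern can be tried out with the public java.lang.ref.Cleaner API. Note the assumptions: this API exists only since Java 9 and is not the internal sun.misc.Cleaner used in the JDK 8 source above, and the names CleanerSketch, Release, Resource and fakeAddress are hypothetical. The sketch releases explicitly through clean() so that the result does not depend on GC timing.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

class CleanerSketch {
    static final Cleaner CLEANER = Cleaner.create();
    // Counts how often the release action really ran (for illustration only).
    static final AtomicInteger RELEASED = new AtomicInteger();

    // Plays the role of Deallocator: it holds only the state needed to release
    // the resource and never references the owner, otherwise the owner could
    // never become phantom reachable.
    static final class Release implements Runnable {
        private final long fakeAddress; // hypothetical handle, not a real pointer

        Release(long fakeAddress) {
            this.fakeAddress = fakeAddress;
        }

        @Override
        public void run() {
            System.out.println("releasing handle " + fakeAddress);
            RELEASED.incrementAndGet();
        }
    }

    // Plays the role of DirectByteBuffer: registers its release action on construction.
    static final class Resource implements AutoCloseable {
        private final Cleaner.Cleanable cleanable;

        Resource(long fakeAddress) {
            this.cleanable = CLEANER.register(this, new Release(fakeAddress));
        }

        @Override
        public void close() {
            cleanable.clean(); // runs the action at most once, even if called repeatedly
        }
    }

    public static void main(String[] args) {
        Resource resource = new Resource(42L);
        resource.close();
        resource.close();                   // second call is a no-op
        System.out.println(RELEASED.get()); // 1
    }
}
```

If close() is never called, the Cleaner runs the same action after the Resource becomes unreachable, which mirrors what happens to a DirectByteBuffer that is simply dropped.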

## DirectByteBuffer Read/Write Logic

public ByteBuffer put(int i, byte x) {
    unsafe.putByte(ix(checkIndex(i)), ((x)));
    return this;
}

public byte get(int i) {
    return ((unsafe.getByte(ix(checkIndex(i)))));
}

private long ix(int i) {
    return address + (i << 0);
}

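DirectByteBuffer uses sun.misc.Unsafe#getByte(long) and sun.misc.Unsafe#putByte(long, byte) to read and write a byte at a given location in off-heap memory. The native implementation of these two methods is fairly involved, so it is not analyzed here.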
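From the caller's side, the absolute put(int, byte) and get(int) shown above look as follows. This is a small sketch; the class name AbsoluteAccessDemo and the helper roundTrip are made up for the example.

```java
import java.nio.ByteBuffer;

class AbsoluteAccessDemo {
    // Writes one byte at an absolute index and reads it back.
    // Absolute access checks the index against limit and leaves position untouched.
    static byte roundTrip(ByteBuffer buffer, int index, byte value) {
        buffer.put(index, value);
        return buffer.get(index);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);
        System.out.println(roundTrip(buffer, 5, (byte) 42)); // 42
        System.out.println(buffer.position());               // 0
        try {
            buffer.get(16); // index == limit is out of range
        } catch (IndexOutOfBoundsException e) {
            System.out.println("index 16 rejected");
        }
    }
}
```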

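## Default Off-Heap Memory Limit
As mentioned above, DirectByteBuffer checks whether enough space is available before requesting memory, and that check is made against a fixed off-heap size limit. Users can control how much DirectByteBuffer memory may be requested with -XX:MaxDirectMemorySize=<size>. But what is the limit by default?

DirectByteBuffer obtains this value from sun.misc.VM#maxDirectMemory. Here is the corresponding code: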

// A user-settable upper limit on the maximum amount of allocatable direct
// buffer memory. This value may be changed during VM initialization if
// "java" is launched with "-XX:MaxDirectMemorySize=<size>".
//
// The initial value of this field is arbitrary; during JRE initialization
// it will be reset to the value specified on the command line, if any,
// otherwise to Runtime.getRuntime().maxMemory().
//
private static long directMemory = 64 * 1024 * 1024;

// Returns the maximum amount of allocatable direct buffer memory.
// The directMemory variable is initialized during system initialization
// in the saveAndRemoveProperties method.
//
public static long maxDirectMemory() {
    return directMemory;
}

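directMemory is initialized to 64 MB here, so is the default off-heap limit 64 MB? No. Read the comment carefully: it says that this value is reset during JRE startup to the value specified by the user, and if the user specified none, to Runtime.getRuntime().maxMemory().

This happens in sun.misc.VM#saveAndRemoveProperties, which is called by java.lang.System#initializeSystemClass: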

public static void saveAndRemoveProperties(Properties props) {
    if (booted)
        throw new IllegalStateException("System initialization has completed");

    savedProps.putAll(props);

    // Set the maximum amount of direct memory.  This value is controlled
    // by the vm option -XX:MaxDirectMemorySize=<size>.
    // The maximum amount of allocatable direct buffer memory (in bytes)
    // from the system property sun.nio.MaxDirectMemorySize set by the VM.
    // The system property will be removed.
    String s = (String)props.remove("sun.nio.MaxDirectMemorySize");
    if (s != null) {
        if (s.equals("-1")) {
            // -XX:MaxDirectMemorySize not given, take default
            directMemory = Runtime.getRuntime().maxMemory();
        } else {
            long l = Long.parseLong(s);
            if (l > -1)
                directMemory = l;
        }
    }

    //...

}

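So by default the total DirectByteBuffer capacity that can be requested is Runtime.getRuntime().maxMemory(), which corresponds to the maximum Java heap size available, that is, approximately the value given with -Xmx.

The final conclusion: by default, the maximum DirectByteBuffer space equals the maximum Java heap size.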
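To see the value that the JDK 8 code above falls back to, you can print Runtime.getRuntime().maxMemory() directly. This is only a sketch: the class name DefaultDirectLimit and the helper maxHeapBytes are made up, and the program shows the heap figure, not the internal directMemory field itself.

```java
class DefaultDirectLimit {
    // The value used as the direct-memory limit in the JDK 8 source above
    // when -XX:MaxDirectMemorySize is not given.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapBytes() / (1024 * 1024));
    }
}
```

Running it with different -Xmx values should change the printed number accordingly.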

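## JVM Options Related to DirectByteBuffer
According to the analysis above, two JVM options relate directly to DirectByteBuffer:

-XX:+PageAlignDirectMemory: whether requested memory must be page-aligned; not aligned by default
-XX:MaxDirectMemorySize=<size>: the maximum total DirectByteBuffer size that can be requested; equal to -Xmx by default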

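## 2. Using Off-Heap Memory in Detail

This part explains how to use direct (off-heap) memory, in the following order:

Background --> read/write operations --> key attributes --> read/write practice --> extensions --> references

It is meant as a quick reference for anyone who wants to use direct memory.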

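## Data Types
The following is background knowledge you need when working with direct buffers; a passing familiarity is enough for now. For a deeper understanding, see the blogs listed in the references.

## Sizes of Primitive Types
Java has a number of primitive types, for example:

byte: 8 bits, i.e. 1 byte
short: 16 bits, i.e. 2 bytes
int: 32 bits, i.e. 4 bytes
long: 64 bits, i.e. 8 bytes
char: 16 bits, i.e. 2 bytes
float: 32 bits, i.e. 4 bytes
double: 64 bits, i.e. 8 bytes
Each type is stored with its own width, and values are promoted automatically where needed.
byte, char and short are promoted to int automatically; if one operand is a long, the other is promoted to long, and the same applies to float and double.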

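## Big-Endian and Little-Endian
A data type may consist of several bytes, so the question is in what order they are laid out in memory. There are two conventions:

Big-endian: the lowest address holds the most significant byte
Little-endian: the lowest address holds the least significant byte
For example, a char consists of two bytes. For the character 'a' (U+0061), the two bytes are stored as follows (values in hexadecimal):

               low address    high address

Big-endian:    00             61
Little-endian: 61             00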
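ByteBuffer lets you pick the byte order explicitly, which makes the two layouts easy to check. The sketch below encodes the character 'a' (U+0061) in both orders; the class name EndianDemo and the helper encode are made up for this example. A newly created ByteBuffer is big-endian unless order(...) is called.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Encodes one char with the given byte order and returns its two bytes as hex,
    // lowest address first.
    static String encode(char c, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(2).order(order);
        buffer.putChar(c);
        return String.format("%02x %02x", buffer.get(0), buffer.get(1));
    }

    public static void main(String[] args) {
        System.out.println(encode('a', ByteOrder.BIG_ENDIAN));    // 00 61
        System.out.println(encode('a', ByteOrder.LITTLE_ENDIAN)); // 61 00
    }
}
```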

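## The Difference Between a String Literal and new String
Now for the difference between "hello" and new String("hello"):

With the literal "hello", the JVM first looks in the shared string pool for an existing "hello". If there is one, it returns that reference; otherwise it creates the object and then returns it. Note that the compiler folds a constant expression such as "a"+"b" into the single literal "ab", so only "ab" is pooled for that expression; three distinct objects ("a", "b" and "ab") arise only when the operands are not compile-time constants, for example when they are held in non-final variables.

new String("hello"), by contrast, always creates a new String object on the heap and returns a reference to it, regardless of whether an equal string is already in the pool.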
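The difference is easy to observe with reference comparison. In the sketch below the class name StringPoolDemo and its three helpers are made up for illustration; == is used on purpose to compare object identity, which is normally the wrong way to compare string contents.

```java
class StringPoolDemo {
    // Two equal literals resolve to the same pooled instance.
    static boolean literalsShareInstance() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // new String(...) yields a different object with equal contents.
    static boolean newStringIsDistinct() {
        String literal = "hello";
        String copy = new String("hello");
        return copy != literal && copy.equals(literal);
    }

    // intern() maps the copy back to the pooled instance.
    static boolean internReturnsPooled() {
        return new String("hello").intern() == "hello";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance()); // true
        System.out.println(newStringIsDistinct());   // true
        System.out.println(internReturnsPooled());   // true
    }
}
```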

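## Reading and Writing Data
Direct memory is requested with allocateDirect(int byte_length). The resulting memory can be thought of as an ordinary byte-based array, so inserting and reading work much like they do with a normal array.

The difference is that there are insert methods for the different data types, for example:

put(byte): inserts one byte
put(byte[]): inserts a byte array
putChar(char): inserts a char
putInt(int): inserts an int
putLong(long): inserts a long
and so on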
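A short sketch of these typed writes, followed by typed reads of the same values. The class name TypedPutDemo and the helper writeAll are made up for this example; the byte counts in the comments follow from the primitive sizes listed earlier.

```java
import java.nio.ByteBuffer;

class TypedPutDemo {
    // Writes one value of each kind and returns the resulting position.
    static int writeAll(ByteBuffer buffer) {
        buffer.put((byte) 1);          // 1 byte,  index 0
        buffer.put(new byte[] {2, 3}); // 2 bytes, indices 1..2
        buffer.putChar('a');           // 2 bytes, indices 3..4
        buffer.putInt(10);             // 4 bytes, indices 5..8
        buffer.putLong(100L);          // 8 bytes, indices 9..16
        return buffer.position();      // 1 + 2 + 2 + 4 + 8 = 17
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(32);
        System.out.println(writeAll(buffer)); // 17
        buffer.flip();
        buffer.position(5);                   // skip the byte, the byte[] and the char
        System.out.println(buffer.getInt());  // 10
        System.out.println(buffer.getLong()); // 100
    }
}
```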


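## Basic Attributes
A buffer has several key attributes, which always satisfy:

mark <= position <= limit <= capacity
In addition, remaining = limit - position.

Let's look at what each of them means.

## Current Position: position
position is the buffer's cursor: the index at which the next read or write takes place. For example: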

ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
buffer.putChar('a');
System.out.println(buffer);
buffer.putChar('c');
System.out.println(buffer);
buffer.putInt(10);
System.out.println(buffer);

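Since a char is 2 bytes and an int is 4 bytes, the positions are, in order:

2,4,8

Note that position is the index where the next value will be inserted, and it advances automatically on every insert.
In other words, after two bytes have been stored, position points at the third byte, whose index is 2.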

java.nio.DirectByteBuffer[pos=2 lim=1024 cap=1024]
java.nio.DirectByteBuffer[pos=4 lim=1024 cap=1024]
java.nio.DirectByteBuffer[pos=8 lim=1024 cap=1024]

position can be read with position() and set with position(int).

// Source of the position(int) method
public final Buffer position(int newPosition) {
    if ((newPosition > limit) || (newPosition < 0))
        throw new IllegalArgumentException();
    position = newPosition;
    if (mark > position) mark = -1;
    return this;
}

Note: position must not exceed limit, and moving it below the mark discards the mark.

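## Capacity: capacity

capacity is the size of the direct memory that was requested; it never changes after allocation.

It can be read with the capacity() method.

## Limit: limit

We may want to use only part of this direct memory, which is what the limit attribute is for: it is the index of the first byte that must not be read or written.

limit can be read with limit() and set with limit(int).
Note that limit is never smaller than mark or position and never larger than capacity; limit(int) pulls position back and discards the mark if necessary.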

// Source of the limit(int) method
public final Buffer limit(int newLimit) {
    if ((newLimit > capacity) || (newLimit < 0))
        throw new IllegalArgumentException();
    limit = newLimit;
    if (position > limit) position = limit;
    if (mark > limit) mark = -1;
    return this;
}
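The effect of limit is easiest to see when a write runs into it. The following sketch shrinks the usable window of a 1024-byte buffer to 4 bytes; the class name LimitDemo and the helper overflowsOnInt are made up for this example.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

class LimitDemo {
    // Tries to write one int and reports whether it ran past the current limit.
    static boolean overflowsOnInt(ByteBuffer buffer) {
        try {
            buffer.putInt(1);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
        buffer.limit(4);                            // only indices 0..3 are usable now
        System.out.println(overflowsOnInt(buffer)); // false: 4 bytes fit exactly
        System.out.println(overflowsOnInt(buffer)); // true: position == limit
        System.out.println(buffer.capacity());      // 1024, unchanged
    }
}
```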
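## Mark: mark
mark is just a bookmark that records the current value of position. Its typical use is to remember where some data was inserted so that you can return there later.

Use mark() to set the mark,
reset() to move position back to the mark,
and rewind() to return to the initial state (position 0, mark discarded).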

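// mark() records the current position; the mark is -1 (undefined) until then
public final Buffer mark() {
    mark = position;
    return this;
}
// reset() moves position back to the mark. If no mark is set (for example because
// position was moved below it, which discards the mark), it throws InvalidMarkException
public final Buffer reset() {
    int m = mark;
    if (m < 0)
        throw new InvalidMarkException();
    position = m;
    return this;
}
// rewind() resets mark to -1 and position to 0
public final Buffer rewind() {
    position = 0;
    mark = -1;
    return this;
}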

## Usage Example

ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
buffer.putChar('a');
buffer.putChar('c');
System.out.println("after writes " + buffer);
buffer.mark();// record the mark at the current position
buffer.position(30);// the new position must not be below the mark, or the mark is discarded
System.out.println("before reset " + buffer);
buffer.reset();// after reset, position == mark
System.out.println("after reset " + buffer);
buffer.rewind();// discard the mark: position becomes 0, mark becomes -1
System.out.println("after rewind " + buffer);

Running it produces:

after writes java.nio.DirectByteBuffer[pos=4 lim=1024 cap=1024]
before reset java.nio.DirectByteBuffer[pos=30 lim=1024 cap=1024]
after reset java.nio.DirectByteBuffer[pos=4 lim=1024 cap=1024]
after rewind java.nio.DirectByteBuffer[pos=0 lim=1024 cap=1024]

## Remaining Space: remaining
remaining is the number of bytes between position and limit:

public final int remaining() {
    return limit - position;
}

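## Read/Write Practice

Writing means putting values of the chosen data types into direct memory. Each write automatically advances position by the length of the data written, so that it points to where the next write should start.

Here is how to write a byte[] or a string: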

ByteBuffer buffer = ByteBuffer.allocateDirect(10);
byte[] data = {1,2};
buffer.put(data);
System.out.println("after writing byte[] " + buffer);
buffer.clear();
buffer.put("hello".getBytes());
System.out.println("after writing string " + buffer);

The output is:

after writing byte[] java.nio.DirectByteBuffer[pos=2 lim=10 cap=10]
after writing string java.nio.DirectByteBuffer[pos=5 lim=10 cap=10]

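To read, you can copy the data out into an external byte[] array. The typed getters (get(), getInt() and so on) read single values straight from direct memory, but a direct buffer exposes no backing array, so code inside the JVM that needs the data as a byte[] has to allocate heap space and copy the data into it.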

ByteBuffer buffer = ByteBuffer.allocateDirect(10);
buffer.put(new byte[]{1,2,3,4});
System.out.println("after writing " +buffer);
buffer.flip();
System.out.println("after flip " +buffer);
byte[] target = new byte[buffer.limit()];
buffer.get(target);// reads exactly target.length bytes
for(byte b : target){
    System.out.println(b);
}
System.out.println("after reading the array " +buffer);

The output is:

after writing java.nio.DirectByteBuffer[pos=4 lim=10 cap=10]
after flip java.nio.DirectByteBuffer[pos=0 lim=4 cap=10]
1
2
3
4
after reading the array java.nio.DirectByteBuffer[pos=4 lim=4 cap=10]
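Building on the example above, the sketch below round-trips a string through a direct buffer via a heap array, and also reads an int back with a typed getter and no intermediate array. The class name ReadBackDemo and both helpers are made up; the explicit UTF-8 charset is a choice made for this example, whereas the earlier snippet uses the platform default charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ReadBackDemo {
    // Writes a string into a direct buffer, then copies it back out through a heap array.
    static String roundTrip(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocateDirect(encoded.length);
        buffer.put(encoded);
        buffer.flip();                                // limit = bytes written, position = 0
        byte[] target = new byte[buffer.remaining()]; // heap copy of the readable bytes
        buffer.get(target);
        return new String(target, StandardCharsets.UTF_8);
    }

    // Reads a single value straight from the buffer, with no intermediate array.
    static int readIntDirectly(int value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(4);
        buffer.putInt(value);
        buffer.flip();
        return buffer.getInt();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));     // hello
        System.out.println(readIntDirectly(12345)); // 12345
    }
}
```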

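## Common Methods

The read/write examples above use several common methods:

#### clear()

This method resets position to 0, limit to capacity, and discards the mark. It does not erase the data itself: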

public final Buffer clear() {
    position = 0;
    limit = capacity;
    mark = -1;
    return this;
}
#### flip()
This method sets limit to the current position and then resets position to 0. It is mainly used to switch from writing to reading.

public final Buffer flip() {
    limit = position;
    position = 0;
    mark = -1;
    return this;
}
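A small sketch of what flip() does to remaining(): before the flip it is the free space left for writing, afterwards it is the number of bytes available for reading. The class name FlipDemo and the helper remainingAroundFlip are made up for this example.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class FlipDemo {
    // Returns {remaining before flip, remaining after flip}
    // for a buffer of the given capacity holding `written` bytes.
    static int[] remainingAroundFlip(int capacity, int written) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(capacity);
        buffer.put(new byte[written]);
        int before = buffer.remaining(); // free space left for writing
        buffer.flip();
        int after = buffer.remaining();  // bytes available for reading
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(remainingAroundFlip(10, 4))); // [6, 4]
    }
}
```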

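#### compact()
This method is useful when only part of the data has been read.
It copies the unread bytes (those between position and limit) to the start of the buffer: the byte at position moves to index 0, the byte at position+1 to index 1, and so on. Afterwards position is set to the number of bytes copied and limit to capacity, so new data can be appended behind the unread bytes.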

public ByteBuffer compact() {
    int pos = position();
    int lim = limit();
    assert (pos <= lim);
    int rem = (pos <= lim ? lim - pos : 0);

    unsafe.copyMemory(ix(pos), ix(0), rem << 0);
    position(rem);
    limit(capacity());
    discardMark();
    return this;
}

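For example, suppose the buffer content is:

123456789

If position is 2 (and limit is 9) when compact() is called, the content becomes:

345678989

The seven unread bytes have moved to the front, position is now 7, and the last two bytes are leftovers from the old content.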
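The example can be reproduced with a few lines of code. In this sketch the class name CompactDemo and the helper compactAfterReading are made up; the buffer is filled with ASCII digits so that the result can be printed as text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CompactDemo {
    // Fills a direct buffer with `content`, pretends `readSoFar` bytes were read,
    // compacts, and returns the whole buffer as text plus the new position.
    static String compactAfterReading(String content, int readSoFar) {
        byte[] bytes = content.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes.length);
        buffer.put(bytes);
        buffer.flip();
        buffer.position(readSoFar);
        buffer.compact();           // unread bytes move to index 0, limit becomes capacity
        byte[] all = new byte[buffer.capacity()];
        for (int i = 0; i < all.length; i++) {
            all[i] = buffer.get(i); // absolute reads leave position untouched
        }
        return new String(all, StandardCharsets.US_ASCII) + " pos=" + buffer.position();
    }

    public static void main(String[] args) {
        System.out.println(compactAfterReading("123456789", 2)); // 345678989 pos=7
    }
}
```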

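#### isDirect()
This method tells whether the buffer is backed by direct memory: it returns true if so and false otherwise.

#### rewind()
This method sets position back to 0 and discards the mark: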

public final Buffer rewind() {
    position = 0;
    mark = -1;
    return this;
}