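Recently I had a requirement to estimate how much memory a Java list of 100,000 entries occupies once it is loaded, and how that affects server memory usage. Until now I had only read about the Java memory layout in books, so I took this opportunity to analyze in practice how much memory Java objects occupy and to deepen my understanding of the layout.
A quick recap of the Java object memory layout: an object consists of an object header (Header), instance data (Instance Data), and alignment padding (Padding). The space an object occupies can differ between environments. The experiments in this article run on a 64-bit HotSpot VM with compressed pointers enabled by default (-XX:+UseCompressedOops). As Figure 1 shows, the header of an ordinary object instance is therefore 12 bytes (8-byte markOop + 4-byte klassOop), and the header of an array instance is 16 bytes (8-byte markOop + 4-byte klassOop + 4-byte length). The machine runs 64-bit Linux, and HotSpot aligns objects to 8 bytes by default (-XX:ObjectAlignmentInBytes=8), so every object size is a multiple of 8. The environment is: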
xxx:~$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
To measure object sizes, this article uses org.apache.lucene.util.RamUsageEstimator#sizeOf(java.lang.Object), which counts the object itself plus every object it references. The corresponding jar version is:
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>4.2.0</version>
</dependency>
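Primitive types
Articles about primitive types usually list the table below. So does a long object created with new occupy exactly 8 bytes? According to the object layout in Figure 1, it must take more than 8 bytes.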
Primitive Type | Memory Required(bytes) |
---|---|
byte, boolean | 1 |
short, char | 2 |
int, float | 4 |
long, double | 8 |
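The following example analyzes the memory occupied by primitive values. Because sizeOf takes an Object, each primitive argument is autoboxed, so the measured size is that of the corresponding wrapper object.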
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
public static void main(String[] args) {
boolean bool = true;
byte b = (byte)0xFF;
short s = (short)1;
char c = 'c';
int i = 1;
float f = 1.0f;
long l = 1L;
double d = 1.0;
System.out.printf("sizeOf(byte) = %s bytes\n", RamUsageEstimator.sizeOf(b));
System.out.printf("sizeOf(boolean) = %s bytes\n", RamUsageEstimator.sizeOf(bool));
System.out.printf("sizeOf(short) = %s bytes\n", RamUsageEstimator.sizeOf(s));
System.out.printf("sizeOf(char) = %s bytes\n", RamUsageEstimator.sizeOf(c));
System.out.printf("sizeOf(int) = %s bytes\n", RamUsageEstimator.sizeOf(i));
System.out.printf("sizeOf(float) = %s bytes\n", RamUsageEstimator.sizeOf(f));
System.out.printf("sizeOf(long) = %s bytes\n", RamUsageEstimator.sizeOf(l));
System.out.printf("sizeOf(double) = %s bytes\n", RamUsageEstimator.sizeOf(d));
}
}
Output:
sizeOf(byte) = 16 bytes
sizeOf(boolean) = 16 bytes
sizeOf(short) = 16 bytes
sizeOf(char) = 16 bytes
sizeOf(int) = 16 bytes
sizeOf(float) = 16 bytes
sizeOf(long) = 24 bytes
sizeOf(double) = 24 bytes
Breakdown of the memory occupied by each primitive-type object:
sizeOf(byte)=12(Header) + 1(Instance Data) + 3(Padding)=16 bytes
sizeOf(boolean)=12(Header) + 1(Instance Data) + 3(Padding)=16 bytes
sizeOf(short)=12(Header) + 2(Instance Data) + 2(Padding)=16 bytes
sizeOf(char)=12(Header) + 2(Instance Data) + 2(Padding)=16 bytes
sizeOf(int)=12(Header) + 4(Instance Data)=16 bytes
sizeOf(float)=12(Header) + 4(Instance Data)=16 bytes
sizeOf(long)=12(Header) + 8(Instance Data) + 4(Padding)=24 bytes
sizeOf(double)=12(Header) + 8(Instance Data) + 4(Padding)=24 bytes
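The breakdown above is plain arithmetic: header plus value, rounded up to a multiple of 8. The following is a minimal sketch of that calculation, assuming the 12-byte header and 8-byte alignment of the environment described earlier. The class and method names are hypothetical and not part of Lucene or the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WrapperSizeSketch {
    // Assumed layout: 64-bit HotSpot with compressed oops enabled.
    static final int OBJECT_HEADER = 12; // 8-byte mark word + 4-byte klass pointer
    static final int ALIGNMENT = 8;

    // Rounds size up to the next multiple of ALIGNMENT.
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Size of an object that holds a single primitive field of valueBytes bytes.
    static long boxedSize(int valueBytes) {
        return alignUp(OBJECT_HEADER + valueBytes);
    }

    public static void main(String[] args) {
        Map<String, Integer> valueBytes = new LinkedHashMap<>();
        valueBytes.put("byte", Byte.BYTES);
        valueBytes.put("boolean", 1); // Boolean has no BYTES constant; 1 byte is assumed
        valueBytes.put("short", Short.BYTES);
        valueBytes.put("char", Character.BYTES);
        valueBytes.put("int", Integer.BYTES);
        valueBytes.put("float", Float.BYTES);
        valueBytes.put("long", Long.BYTES);
        valueBytes.put("double", Double.BYTES);

        for (Map.Entry<String, Integer> e : valueBytes.entrySet()) {
            long size = boxedSize(e.getValue());
            long padding = size - OBJECT_HEADER - e.getValue();
            System.out.printf("sizeOf(%s)=%d(Header) + %d(Instance Data) + %d(Padding)=%d bytes%n",
                    e.getKey(), OBJECT_HEADER, e.getValue(), padding, size);
        }
    }
}
```

The sketch only predicts sizes from the assumed constants; the measured values from RamUsageEstimator remain the reference.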
Next, the same analysis for the wrapper classes of the primitive types.
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
public static void main(String[] args) {
Boolean bool = true;
Byte b = (byte)0xFF;
Short s = (short)1;
Character c = 'c';
Integer i = 1;
Float f = 1.0f;
Long l = 1L;
Double d = 1.0;
System.out.printf("sizeOf(Boolean) = %s bytes\n", RamUsageEstimator.sizeOf(bool));
System.out.printf("sizeOf(Byte) = %s bytes\n", RamUsageEstimator.sizeOf(b));
System.out.printf("sizeOf(Short) = %s bytes\n", RamUsageEstimator.sizeOf(s));
System.out.printf("sizeOf(Character) = %s bytes\n", RamUsageEstimator.sizeOf(c));
System.out.printf("sizeOf(Integer) = %s bytes\n", RamUsageEstimator.sizeOf(i));
System.out.printf("sizeOf(Float) = %s bytes\n", RamUsageEstimator.sizeOf(f));
System.out.printf("sizeOf(Long) = %s bytes\n", RamUsageEstimator.sizeOf(l));
System.out.printf("sizeOf(Double) = %s bytes\n", RamUsageEstimator.sizeOf(d));
}
}
The output matches the layout analysis for the primitive types above.
sizeOf(Boolean) = 16 bytes
sizeOf(Byte) = 16 bytes
sizeOf(Short) = 16 bytes
sizeOf(Character) = 16 bytes
sizeOf(Integer) = 16 bytes
sizeOf(Float) = 16 bytes
sizeOf(Long) = 24 bytes
sizeOf(Double) = 24 bytes
Special objects
The following example analyzes the memory occupied by null and by a plain Object.
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
public static void main(String[] args) {
System.out.printf("sizeOf(null) = %s bytes\n", RamUsageEstimator.sizeOf((Object)null));
System.out.printf("sizeOf(new Object()) = %s bytes\n", RamUsageEstimator.sizeOf(new Object()));
}
}
The output below shows that null is not allocated any memory, and that sizeOf(new Object()) = 12(Header) + 4(Padding) = 16 bytes:
sizeOf(null) = 0 bytes
sizeOf(new Object()) = 16 bytes
Arrays
The following example analyzes the memory occupied by Java arrays.
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
public static void main(String[] args) {
int[] array0 = new int[0];
int[] array1 = new int[1];
int[] array2 = new int[2];
int[] array3 = new int[3];
int[] array8 = new int[8];
int[] array9 = new int[9];
System.out.printf("sizeOf(array0) = %s bytes\n", RamUsageEstimator.sizeOf(array0));
System.out.printf("length(array0) = %s bytes\n", array0.length);
System.out.printf("sizeOf(array1) = %s bytes\n", RamUsageEstimator.sizeOf(array1));
System.out.printf("sizeOf(array2) = %s bytes\n", RamUsageEstimator.sizeOf(array2));
System.out.printf("sizeOf(array3) = %s bytes\n", RamUsageEstimator.sizeOf(array3));
System.out.printf("sizeOf(array8) = %s bytes\n", RamUsageEstimator.sizeOf(array8));
System.out.printf("sizeOf(array9) = %s bytes\n", RamUsageEstimator.sizeOf(array9));
}
}
Output:
sizeOf(array0) = 16 bytes
length(array0) = 0 bytes
sizeOf(array1) = 24 bytes
sizeOf(array2) = 24 bytes
sizeOf(array3) = 32 bytes
sizeOf(array8) = 48 bytes
sizeOf(array9) = 56 bytes
As Figure 1 shows, an array instance has a 16-byte header, unlike an ordinary object instance. The array sizes break down as follows:
sizeOf(array0)=16(Header)=16 bytes
length(array0)=0
sizeOf(array1)=16(Header) + 4(int) + 4(Padding)=24 bytes
sizeOf(array2)=16(Header) + 4(int)*2=24 bytes
sizeOf(array3)=16(Header) + 4(int)*3 + 4(Padding)=32 bytes
sizeOf(array8)=16(Header) + 4(int)*8=48 bytes
sizeOf(array9)=16(Header) + 4(int)*9 + 4(Padding)=56 bytes
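The same pattern holds for any array length. Below is a small sketch, assuming the 16-byte array header and 8-byte alignment stated above; ArraySizeSketch and its methods are hypothetical names used only for illustration.

```java
public class ArraySizeSketch {
    // Assumed: 12-byte object header + 4-byte length field (compressed oops enabled).
    static final int ARRAY_HEADER = 16;

    // Rounds size up to the next multiple of 8 using a bit mask.
    static long alignUp(long size) {
        return (size + 7) & ~7L;
    }

    // Size of an array with the given length and per-element width in bytes.
    static long arraySize(int length, int elementBytes) {
        return alignUp(ARRAY_HEADER + (long) length * elementBytes);
    }

    public static void main(String[] args) {
        for (int length : new int[] {0, 1, 2, 3, 8, 9}) {
            System.out.printf("sizeOf(int[%d]) = %d bytes%n",
                    length, arraySize(length, Integer.BYTES));
        }
    }
}
```

Odd int lengths pick up 4 bytes of padding, even lengths need none, which is why array1 and array2 measure the same.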
String
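In JDK 1.7 and 1.8, part of the String source is shown below. It declares four fields. The static fields belong to the class rather than to any instance, so they are stored with the class and not in each object; only instance fields count toward the size of a Java object: a char[] that holds the character data and an int that caches the hash code. (From JDK 9 on, String stores its characters in a byte[], so the numbers below apply to JDK 8.) The section on custom objects below verifies that static fields do not count toward an object's heap footprint.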
public final class String
implements java.io.Serializable, Comparable<String>, CharSequence {
/** The value is used for character storage. */
private final char value[];
/** Cache the hash code for the string */
private int hash; // Default to 0
/** use serialVersionUID from JDK 1.0.2 for interoperability */
private static final long serialVersionUID = -6849794470754667710L;
private static final ObjectStreamField[] serialPersistentFields =
new ObjectStreamField[0];
}
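Therefore, a String object by itself needs 12(Header) + 4(char[] reference) + 4(int) + 4(Padding) = 24 bytes.
In addition, the char[] occupies 16(Array Header) + length * 2 bytes (aligned to 8 bytes), where length is the length of the string. As Figure 2 shows, the total memory occupied by a String is:
40 + length * 2 bytes + Padding
The following example analyzes the memory occupied by Java String objects.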
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
public class RamUsageEstimatorTest {
public static void main(String[] args) {
String s0 = "";
String s1 = "a";
String s2 = "aa";
String s4 = "aaaa";
String s5 = "aaaaa";
System.out.printf("sizeOf(s0) = %s bytes\n", RamUsageEstimator.sizeOf(s0));
System.out.printf("sizeOf(s1) = %s bytes\n", RamUsageEstimator.sizeOf(s1));
System.out.printf("sizeOf(s2) = %s bytes\n", RamUsageEstimator.sizeOf(s2));
System.out.printf("sizeOf(s4) = %s bytes\n", RamUsageEstimator.sizeOf(s4));
System.out.printf("sizeOf(s5) = %s bytes\n", RamUsageEstimator.sizeOf(s5));
}
}
Output:
sizeOf(s0) = 40 bytes
sizeOf(s1) = 48 bytes
sizeOf(s2) = 48 bytes
sizeOf(s4) = 48 bytes
sizeOf(s5) = 56 bytes
Breakdown of the results for the strings above:
sizeOf(s0)=40 + 0 * 2 = 40 bytes
sizeOf(s1)=40 + 1 * 2 + 6(Padding) = 48 bytes
sizeOf(s2)=40 + 2 * 2 + 4(Padding) = 48 bytes
sizeOf(s4)=40 + 4 * 2 = 48 bytes
sizeOf(s5)=40 + 5 * 2 + 6(Padding) = 56 bytes
Custom objects
The following example analyzes the memory occupied by a custom Java object.
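The String formula can be checked with a few lines of arithmetic. This is a sketch under the assumptions of this article (JDK 8, char[]-backed String, compressed oops); StringSizeSketch is a hypothetical helper, not a JDK or Lucene class.

```java
public class StringSizeSketch {
    // Assumed: 12 header + 4 char[] reference + 4 hash + 4 padding.
    static final int STRING_SHELL = 24;
    // Assumed: 12-byte object header + 4-byte length field.
    static final int ARRAY_HEADER = 16;

    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // Estimated deep size of a JDK 8 String with the given number of chars.
    static long stringSize(int length) {
        long charArray = alignUp(ARRAY_HEADER + 2L * length); // 2 bytes per char
        return STRING_SHELL + charArray;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "a", "aa", "aaaa", "aaaaa"}) {
            System.out.printf("sizeOf(\"%s\") = %d bytes%n", s, stringSize(s.length()));
        }
    }
}
```

Because the char[] is aligned on its own, strings of 1 to 4 characters all land on 48 bytes, and the next step up is 56 bytes.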
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
public class Employee {
private long id;
private int age;
public Employee(long id, int age) {
this.id = id;
this.age = age;
}
public static void main(String[] args) {
System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
}
}
Output:
sizeOf(Employee) = 24 bytes
See Figure 3. From the Java object memory layout, the custom object occupies:
sizeOf(Employee) = 12(Header) + 8(long) + 4(int) = 24 bytes
Now add a static field to the custom Employee class, as follows:
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
public class Employee {
private long id;
private int age;
// A static field belongs to the class, not to an instance, so it is stored with the class
private static int staticField = 88;
public Employee(long id, int age) {
this.id = id;
this.age = age;
}
public static void main(String[] args) {
System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
}
}
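The output is unchanged, which confirms that a static field belongs to the class rather than to an instance; only instance fields count toward the size of a Java object.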
sizeOf(Employee) = 24 bytes
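The rule that only instance fields count can also be illustrated with reflection. The sketch below sums the widths of non-static fields and skips static ones. It is a simplification: it assumes a 12-byte header and 4-byte references, and it ignores the gaps HotSpot may insert between superclass and subclass fields, so it is not a replacement for a real estimator. All names in it are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class InstanceFieldSketch {
    static final int OBJECT_HEADER = 12; // assumed: compressed oops enabled
    static final int REFERENCE = 4;      // assumed: compressed references

    // Hypothetical sample class mirroring the Employee above.
    static class Employee {
        private long id;
        private int age;
        private static int staticField = 88;
    }

    // Width in bytes of a field of the given declared type.
    static int fieldBytes(Class<?> type) {
        if (type == long.class || type == double.class) return 8;
        if (type == int.class || type == float.class) return 4;
        if (type == short.class || type == char.class) return 2;
        if (type == byte.class || type == boolean.class) return 1;
        return REFERENCE; // any object or array reference
    }

    // Header plus all instance fields (including inherited ones), aligned to 8 bytes.
    static long estimateShallowSize(Class<?> clazz) {
        long size = OBJECT_HEADER;
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue; // static fields live with the class, not the instance
                }
                size += fieldBytes(f.getType());
            }
        }
        return (size + 7) / 8 * 8;
    }

    public static void main(String[] args) {
        System.out.println(estimateShallowSize(Employee.class));
        System.out.println(estimateShallowSize(Object.class));
    }
}
```

Removing the static check would add 4 bytes for staticField and push the estimate for Employee to 32 bytes, which is not what the estimator reports.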
Next, the custom Employee class references other Java objects, here a Long and an Integer:
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
public class Employee {
private Long id;
private Integer age;
public Employee(long id, int age) {
this.id = id;
this.age = age;
}
public static void main(String[] args) {
System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
}
}
Output:
sizeOf(Employee) = 64 bytes
See Figure 4. From the Java object memory layout, the custom object and the objects it references occupy:
sizeOf(Employee) = 24(Employee Object) + 24(Long Object) + 16(Integer Object) =64 bytes
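The 64 bytes are the sum of three separate objects, each aligned on its own. A minimal sketch of that sum follows, assuming a 12-byte header and 4-byte compressed references; DeepSizeSketch and its methods are hypothetical names.

```java
public class DeepSizeSketch {
    static final int OBJECT_HEADER = 12; // assumed: compressed oops enabled
    static final int REFERENCE = 4;      // assumed: compressed references

    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // Shallow size: header + primitive bytes + one slot per reference field.
    static long shallowSize(int primitiveBytes, int referenceFields) {
        return alignUp(OBJECT_HEADER + primitiveBytes + (long) REFERENCE * referenceFields);
    }

    // Deep size of an Employee that holds a Long id and an Integer age.
    static long employeeDeepSize() {
        long employee = shallowSize(0, 2);             // 12 + 4 + 4 + 4 padding
        long boxedId = shallowSize(Long.BYTES, 0);     // 12 + 8 + 4 padding
        long boxedAge = shallowSize(Integer.BYTES, 0); // 12 + 4
        return employee + boxedId + boxedAge;
    }

    public static void main(String[] args) {
        System.out.println(employeeDeepSize());
    }
}
```

Compared with the primitive-field version (24 bytes), boxing the two fields costs an extra 40 bytes per Employee under these assumptions.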
ArrayList
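In JDK 1.7 and later, part of the ArrayList source is shown below. It declares six fields. The static fields belong to the class rather than to any instance, so only the instance fields count toward the size of the object: an Object[] that holds the elements and an int size. ArrayList also inherits the int field modCount from AbstractList.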
public class ArrayList<E> extends AbstractList<E>
implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
private static final long serialVersionUID = 8683452581122892189L;
/**
* Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;
/**
* Shared empty array instance used for empty instances.
*/
private static final Object[] EMPTY_ELEMENTDATA = {};
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
transient Object[] elementData; // non-private to simplify nested class access
/**
* The size of the ArrayList (the number of elements it contains).
*/
private int size;
}
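Therefore, an ArrayList object by itself needs 12(Header) + 4(Object[] reference) + 4(int size) + 4(int modCount inherited from AbstractList) = 24 bytes.
In addition, the Object[] occupies 16(Array Header) + length * 4(Object reference) bytes (aligned to 8 bytes), where length is the length of the Object[], i.e. the capacity of the ArrayList, and size is the number of elements stored in the ArrayList, with length >= size. The objects actually stored in the array add their own memory. Together with Figure 5, the memory occupied by an ArrayList is therefore:
((40 + length * 4)(aligned to 8 bytes) + size * n bytes)(aligned to 8 bytes), assuming each element object occupies n bytes; the size * n term reflects that only slots that actually hold an object require memory for that object.
The following example analyzes the memory occupied by ArrayList objects.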
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
import java.util.ArrayList;
import java.util.List;
public class RamUsageEstimatorTest {
public static void main(String[] args) {
System.out.printf("sizeOf(ArrayList with 0 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(new ArrayList<>(0)));
System.out.printf("sizeOf(ArrayList with default capacity) = %s bytes\n", RamUsageEstimator.sizeOf(new ArrayList<>()));
List<Integer> list1 = new ArrayList<>(1);
list1.add(1);
System.out.printf("sizeOf(list1 with 1 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(list1));
list1 = new ArrayList<>();
list1.add(1);
System.out.printf("sizeOf(list1 with default capacity) = %s bytes\n", RamUsageEstimator.sizeOf(list1));
}
}
Output:
sizeOf(ArrayList with 0 capacity) = 40 bytes
sizeOf(ArrayList with default capacity) = 40 bytes
sizeOf(list1 with 1 capacity) = 64 bytes
sizeOf(list1 with default capacity) = 96 bytes
sizeOf(ArrayList with 0 capacity) = 40 bytes
Analysis: the constructor is given initialCapacity=0, so sizeOf(ArrayList with 0 capacity) = 40 + 0 * 4(Integer reference) + 0 * 16(Integer) = 40 bytes
sizeOf(ArrayList with default capacity) = 40 bytes
Analysis: the constructor new ArrayList() creates an empty elementData, so sizeOf(ArrayList with default capacity) = 40 + 0 * 4(Integer reference) + 0 * 16(Integer) = 40 bytes
sizeOf(list1 with 1 capacity) = 64 bytes
Analysis: the constructor is given initialCapacity=1, so sizeOf(list1 with 1 capacity) = 40 + 1 * 4(Integer reference) + 1 * 16(Integer) + 4(Padding) = 64 bytes
sizeOf(list1 with default capacity) = 96 bytes
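Analysis: the constructor new ArrayList() creates an empty elementData; the first call to add() initializes elementData with the default minimum capacity of 10, and size=1. So sizeOf(list1 with default capacity) = 40 + 10 * 4(Integer reference) + 1 * 16(Integer) = 96 bytes
Memory footprint of a Java list holding 100,000 entries
The following example shows how to estimate the memory occupied by a Java list of 100,000 entries.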
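The four cases above follow one formula. The sketch below encodes it, assuming a 24-byte ArrayList object, a 16-byte array header, 4-byte references, and elements that all have the same size and are not shared. ArrayListSizeSketch is a hypothetical helper, not part of the JDK or Lucene.

```java
public class ArrayListSizeSketch {
    static final int LIST_SHELL = 24;   // assumed size of the ArrayList object itself
    static final int ARRAY_HEADER = 16; // assumed: compressed oops enabled
    static final int REFERENCE = 4;     // assumed: compressed references

    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    // capacity: length of elementData; size: stored elements; elementSize: bytes per element.
    static long arrayListSize(int capacity, int size, long elementSize) {
        long elementData = alignUp(ARRAY_HEADER + (long) REFERENCE * capacity);
        return LIST_SHELL + elementData + size * elementSize;
    }

    public static void main(String[] args) {
        System.out.println(arrayListSize(0, 0, 16));  // empty list
        System.out.println(arrayListSize(1, 1, 16));  // one Integer, capacity 1
        System.out.println(arrayListSize(10, 1, 16)); // one Integer, default capacity
    }
}
```

The capacity term depends only on the backing array, so an oversized capacity costs 4 bytes per unused slot regardless of the element type.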
package study.estimator;
import org.apache.lucene.util.RamUsageEstimator;
import java.util.ArrayList;
import java.util.List;
public class Employee {
private long id;
private int age;
public Employee(long id, int age) {
this.id = id;
this.age = age;
}
public static void main(String[] args) {
System.out.printf("sizeOf(Employee) = %s bytes\n", RamUsageEstimator.sizeOf(new Employee(123456789L, 28)));
List<Employee> employeeList = new ArrayList<>(100000);
for (int i = 0; i < 100000; i++) {
employeeList.add(new Employee(123456789L, 28));
}
System.out.printf("sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = %s bytes\n", RamUsageEstimator.sizeOf(employeeList));
employeeList = new ArrayList<>();
for (int i = 0; i < 100000; i++) {
employeeList.add(new Employee(123456789L, 28));
}
System.out.printf("sizeOf(List<Employee> contains 100000 Employee object) = %s bytes\n", RamUsageEstimator.sizeOf(employeeList));
}
}
Output:
sizeOf(Employee) = 24 bytes
sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = 2800040 bytes
sizeOf(List<Employee> contains 100000 Employee object) = 2826880 bytes
From the ArrayList analysis in the previous section:
sizeOf(List<Employee> contains 100000 Employee object with 100000 capacity) = 2800040 bytes = 40 + 100000 * 4(Employee Reference) + 100000 * 24(Employee Object)
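If new ArrayList is called without a capacity, or the list grows beyond its capacity, elementData is expanded: the elements of the old array are copied into a new array, and each expansion grows the capacity to 1.5 times the previous value. Starting from the default capacity of 10, growing an ArrayList to hold 100,000 entries ends with a capacity of 106710, computed as follows: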
package study;
public class StudyTest {
public static void main(String[] args) {
int capacity = 10;
while (true) {
// Same growth step as ArrayList in JDK 8: old capacity + old capacity / 2
capacity += capacity >> 1;
if (capacity >= 100000) {
break;
}
}
System.out.println(capacity);
}
}
Finally, the last line of the output above breaks down as follows:
sizeOf(List<Employee> contains 100000 Employee object) = 2826880 bytes = 40 + 106710 * 4(Employee Reference) + 100000 * 24(Employee Object) = 2.696MB
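The difference between the two measurements comes entirely from unused slots in the backing array. The sketch below recomputes the final capacity and that overhead. It assumes the JDK 8 growth step (old + old / 2), a 16-byte array header and 4-byte references; the class and method names are hypothetical.

```java
public class GrowthOverheadSketch {
    static final int ARRAY_HEADER = 16; // assumed: compressed oops enabled
    static final int REFERENCE = 4;     // assumed: compressed references

    // Capacity after adding elements one by one, starting from initialCapacity.
    static int grownCapacity(int initialCapacity, int elements) {
        int capacity = initialCapacity;
        while (capacity < elements) {
            // Grow by half; the max() guard keeps tiny capacities (0 or 1) moving.
            capacity = Math.max(capacity + (capacity >> 1), capacity + 1);
        }
        return capacity;
    }

    // Size of an Object[] with the given capacity, aligned to 8 bytes.
    static long backingArrayBytes(int capacity) {
        long raw = ARRAY_HEADER + (long) REFERENCE * capacity;
        return (raw + 7) / 8 * 8;
    }

    // Extra bytes of a grown list compared with a list presized to exactly `elements`.
    static long overheadBytes(int initialCapacity, int elements) {
        int grown = grownCapacity(initialCapacity, elements);
        return backingArrayBytes(grown) - backingArrayBytes(elements);
    }

    public static void main(String[] args) {
        System.out.println(grownCapacity(10, 100000));
        System.out.println(overheadBytes(10, 100000));
    }
}
```

For 100,000 entries the overhead equals 2826880 - 2800040 = 26840 bytes, about 1% of the total, so presizing the list mainly saves the repeated array copies rather than steady-state memory.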
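Further practice
- You can apply the same analysis to HashMap, enum classes, or your own objects.
- Using the code above, you can also disable compressed pointers with -XX:-UseCompressedOops and rerun it to see how the change in header size affects the memory occupied by Java objects.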
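Before running the second exercise, it can help to write down what to expect. The sketch below compares predicted sizes for both modes. The values for the uncompressed mode (16-byte object header, 24-byte array header, 8-byte references) are assumptions about 64-bit HotSpot on JDK 8 and should be confirmed by measuring with RamUsageEstimator; all names are hypothetical.

```java
public class HeaderModeSketch {
    // Hypothetical holder for the layout constants of one JVM configuration.
    static final class Layout {
        final int objectHeader;
        final int arrayHeader;
        final int reference;

        Layout(int objectHeader, int arrayHeader, int reference) {
            this.objectHeader = objectHeader;
            this.arrayHeader = arrayHeader;
            this.reference = reference;
        }

        long alignUp(long size) {
            return (size + 7) / 8 * 8;
        }

        // Shallow size of an instance with the given primitive bytes and reference fields.
        long instance(long primitiveBytes, int references) {
            return alignUp(objectHeader + primitiveBytes + (long) reference * references);
        }

        // Size of a primitive array with the given length and element width.
        long array(int length, int elementBytes) {
            return alignUp(arrayHeader + (long) length * elementBytes);
        }
    }

    static final Layout COMPRESSED = new Layout(12, 16, 4);   // -XX:+UseCompressedOops
    static final Layout UNCOMPRESSED = new Layout(16, 24, 8); // assumed for -XX:-UseCompressedOops

    public static void main(String[] args) {
        for (Layout l : new Layout[] {COMPRESSED, UNCOMPRESSED}) {
            System.out.printf("header=%d: Integer=%d, Long=%d, Employee(long,int)=%d, int[3]=%d%n",
                    l.objectHeader, l.instance(4, 0), l.instance(8, 0),
                    l.instance(12, 0), l.array(3, 4));
        }
    }
}
```

If the measured sizes differ from these predictions, the assumed constants are the first thing to revisit.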