Reposted from: https://chawlasumit.wordpress.com/2016/03/14/hadoop-gc-overhead-limit-exceeded-error/
In our Hadoop setup, we ended up with more than one million files in a single folder. The folder contained so many files that any hdfs dfs command against it, such as -ls or -copyToLocal, failed with the following error (example invocations are shown after the stack trace):
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuffer.append(StringBuffer.java:237)
at java.net.URI.appendAuthority(URI.java:1852)
at java.net.URI.appendSchemeSpecificPart(URI.java:1890)
at java.net.URI.toString(URI.java:1922)
at java.net.URI.<init>(URI.java:749)
at org.apache.hadoop.fs.Path.initialize(Path.java:203)
at org.apache.hadoop.fs.Path.<init>(Path.java:116)
at org.apache.hadoop.fs.Path.<init>(Path.java:94)
at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:230)
at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.makeQualified(HdfsFileStatus.java:263)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:732)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
at org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:268)
at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
at org.apache.hadoop.fs.shell.CommandWithDestination.recursePath(CommandWithDestination.java:291)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
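For reference, these were the kinds of invocations that hit the error; a minimal sketch, with an illustrative path standing in for the real one:

# Listing or copying the directory with ~1 million entries (path is illustrative)
hdfs dfs -ls /path/with/a/million/files
hdfs dfs -copyToLocal /path/with/a/million/files /tmp/local-copy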
After doing some research, we added the following environment variable to update the Hadoop runtime options. The -XX:-UseGCOverheadLimit flag turns off the JVM check that throws this error when the process spends almost all of its time in garbage collection while reclaiming very little heap.
export HADOOP_OPTS="-XX:-UseGCOverheadLimit"
Adding this option got rid of the GC overhead error, but the command then failed with the following error, this time due to a lack of Java heap space.
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1351)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1413)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1524)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1533)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:557)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy15.getListing(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1969)
at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1952)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:724)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
at org.apache.hadoop.fs.shell.PathData.getDirectoryContents(PathData.java:268)
at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:347)
at org.apache.hadoop.fs.shell.CommandWithDestination.recursePath(CommandWithDestination.java:291)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:308)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
We modified the above export and tried the following instead. Note that we needed to set HADOOP_CLIENT_OPTS rather than HADOOP_OPTS to fix this error, because all hadoop shell commands run as clients: HADOOP_OPTS is meant for modifying the actual Hadoop runtime, while HADOOP_CLIENT_OPTS modifies the runtime of the Hadoop command-line client. The -Xmx4096m setting raises the client JVM's maximum heap to 4 GB, which gives the directory listing enough memory to complete.
export HADOOP_CLIENT_OPTS="-XX:-UseGCOverheadLimit -Xmx4096m"
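To make the setting persist beyond the current shell session, one option is to add the export to hadoop-env.sh, which the hadoop and hdfs launcher scripts source on startup. The snippet below is a sketch, assuming hadoop-env.sh lives under $HADOOP_CONF_DIR; the exact location varies by distribution, and the path in the retry command is illustrative.

# Persist the client-side JVM options (adjust the file location for your installation)
echo 'export HADOOP_CLIENT_OPTS="-XX:-UseGCOverheadLimit -Xmx4096m"' >> "$HADOOP_CONF_DIR/hadoop-env.sh"

# Retry the command that previously failed (path is illustrative)
hdfs dfs -copyToLocal /path/with/a/million/files /tmp/local-copy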