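Accessing HDFS through the API

Working with HDFS through the API

Today's topics

Getting the HDFS file system
Uploading a file to HDFS
Downloading a file from HDFS
Creating an HDFS directory
Deleting an HDFS directory
Renaming an HDFS file
Viewing HDFS file details
Positioned file reads
Studying the FileSystem class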

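1. Getting the HDFS file system

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.junit.Test;

// Get the file system
@Test
public void initHDFS() throws Exception{
    // 1. Get the file system. Without fs.defaultFS (set in code or in a core-site.xml
    //    on the classpath) this returns the local file system rather than HDFS.
    Configuration configuration = new Configuration();
    FileSystem fileSystem = FileSystem.get(configuration);
    
    // 2. Print the file system to the console
    System.out.println(fileSystem.toString());      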
}

2. Uploading a file to HDFS (testing parameter priority)

@Test
public void putFileToHdfs() throws Exception{
    Configuration conf = new Configuration();
    conf.set("dfs.replication", "2");       // values set in code have the highest priority
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
        
    // Upload the file
    fileSystem.copyFromLocalFile(new Path("hdfs.txt"), new Path("/user/anna/hdfs/test.txt"));
    
    // Release resources
    fileSystem.close(); 
}

Parameter priority: (1) values set in client code > (2) user-defined configuration files on the classpath > (3) the server's default configuration
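
The priority order can be pictured as three layers, where a lookup falls through to the next layer only when the key is missing. The sketch below uses java.util.Properties as a stand-in; ConfigPriorityDemo and layered are hypothetical names, and this is not how Hadoop's Configuration class is implemented.

```java
import java.util.Properties;

// Hypothetical stand-in for layered configuration lookup; not Hadoop's Configuration class.
class ConfigPriorityDemo {
    /** Builds a view in which client code overrides the site file, which overrides the defaults. */
    static Properties layered(Properties serverDefaults, Properties siteFile, Properties clientCode) {
        Properties site = new Properties(serverDefaults); // falls back to the defaults
        site.putAll(siteFile);
        Properties code = new Properties(site);           // falls back to the site layer
        code.putAll(clientCode);
        return code;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("dfs.replication", "3");
        Properties siteFile = new Properties();
        siteFile.setProperty("dfs.replication", "1");
        Properties clientCode = new Properties();
        clientCode.setProperty("dfs.replication", "2");

        // All three layers set the key: the value from client code wins.
        System.out.println(layered(defaults, siteFile, clientCode).getProperty("dfs.replication")); // 2
        // Nothing set in code: the site file wins.
        System.out.println(layered(defaults, siteFile, new Properties()).getProperty("dfs.replication")); // 1
        // Only the defaults are left.
        System.out.println(layered(defaults, new Properties(), new Properties()).getProperty("dfs.replication")); // 3
    }
}
```

The replication values 3 and 1 are made up for the illustration; only the value 2 comes from the upload example above.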

3. Downloading a file from HDFS

public void copyToLocalFile(boolean delSrc,Path src,Path dst,boolean useRawLocalFileSystem)
                 throws IOException

delSrc - whether to delete the src
src - path
dst - path
useRawLocalFileSystem - whether to use RawLocalFileSystem as local file system or not.

@Test
public void testCopyToLocalFile() throws Exception{
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
        
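    // Download the file; passing true for useRawLocalFileSystem writes the local copy without a .crc checksum file
    fileSystem.copyToLocalFile(false,new Path("/user/anna/hdfs/test.txt"), new Path("test.txt"),true);
    
    // Release resources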
    fileSystem.close(); 
}

4. Creating an HDFS directory

@Test
public void testMakedir() throws Exception{
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
        
    // Create the directory (missing parent directories are created as well)
    fileSystem.mkdirs(new Path("/user/anna/test/hahaha"));
    
    // Release resources
    fileSystem.close();
}

5. Deleting an HDFS directory

@Test
public void testDelete() throws Exception{
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
        
    // Delete the directory
    fileSystem.delete(new Path("/user/anna/test/hahaha"),true);         // true means delete recursively
    
    // Release resources
    fileSystem.close();
}
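
To see why the recursive flag matters, here is a small local-disk sketch with plain JDK classes. It assumes, as I understand HDFS to behave, that a non-recursive delete of a non-empty directory is rejected with an IOException. RecursiveDeleteDemo is a hypothetical name, and Path here is java.nio.file.Path, not the Hadoop Path class.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Local-disk illustration only; not Hadoop code.
class RecursiveDeleteDemo {
    /** Deletes path; a non-empty directory is only removed when recursive is true. */
    static void delete(Path path, boolean recursive) throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(path)) {
            // Deepest entries first, so children are deleted before their parent directory.
            entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        // The walk includes path itself, so more than one entry means it has children.
        if (!recursive && entries.size() > 1) {
            throw new IOException(path + " is a non-empty directory");
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hahaha");
        Files.createFile(dir.resolve("child.txt"));
        try {
            delete(dir, false);
        } catch (IOException e) {
            System.out.println("refused: " + e.getMessage());
        }
        delete(dir, true);
        System.out.println("still exists: " + Files.exists(dir));
    }
}
```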

6. Renaming an HDFS file

@Test
public void testRename() throws Exception{
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://10.9.190.111:9000");
    FileSystem fileSystem = FileSystem.get(conf);
        
    // Rename the file
    fileSystem.rename(new Path("/user/anna/test/copy.txt"), new Path("/user/anna/test/copyRename.txt"));
    
    // Release resources
    fileSystem.close();
}

7. Viewing HDFS file details

Several ways to do it

1. public abstract FileStatus[] listStatus(Path f) throws FileNotFoundException,IOException
    * Returns an array of FileStatus

2. public FileStatus[] listStatus(Path f,PathFilter filter) throws FileNotFoundException,IOException

3. public FileStatus[] listStatus(Path[] files,PathFilter filter) throws FileNotFoundException,IOException

    * Note that PathFilter is an interface with a single method, accept, which decides whether each path is kept

    * Enumerate all files found in the list of directories passed in, calling listStatus(path, filter) on each one.

Note: in the runs shown here HDFS returned the entries sorted by name; the FileSystem API itself does not promise any particular order, so sort explicitly if order matters

Code: FileStatus[] listStatus(Path f)

// Using FileStatus[] listStatus(Path f)
try {
    // Create the connection to HDFS
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");

    // Obtain the FileSystem
    FileSystem fileSystem = FileSystem.get(conf);

    // List the contents of the /test directory
    FileStatus[] fileStatuses = fileSystem.listStatus(new Path("/test"));

    // Print each entry in the directory
    for(FileStatus fileStatus :fileStatuses) {
        System.out.println(fileStatus.getPath() + "  " + new Date(fileStatus.getAccessTime()) + "  " + 
            fileStatus.getBlockSize() + "  " + fileStatus.getPermission());
    }
}catch(Exception e) {
    e.printStackTrace();
}
/*
Output on JDK 1.8:
----------------------------------------------------------------------------
hdfs://10.9.190.90:9000/test/hadoop-2.7.3.tar.gz  2012-07-26  134217728  rw-r--r--
hdfs://10.9.190.90:9000/test/hello.txt  2012-07-26  134217728  rw-r--r--
hdfs://10.9.190.90:9000/test/test2  1970-01-01  0  rwxr-xr-x
----------------------------------------------------------------------------
*/

Code: FileStatus[] listStatus(Path f,PathFilter filter)

try {
    // Create the connection to HDFS
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");

    // Obtain the FileSystem
    FileSystem fileSystem = FileSystem.get(conf);

    // List the entries in the directory whose names end with .md
    FileStatus[] statuses = fileSystem.listStatus(new Path("/test/test2"), new PathFilter() {

        @Override
        public boolean accept(Path path) {
            // Keep only paths that end with .md
            return path.toString().endsWith(".md");
        }
    });

    // Print the file information
    for(FileStatus status : statuses) {
        System.out.println("Path : " + status.getPath() + "  Permission : " + status.getPermission() + 
                "  Replication : " + status.getReplication());
    }
}catch(Exception e) {
    e.printStackTrace();
}
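
The filtering step itself can be tried without a cluster. The sketch below is a plain-JDK illustration in which NameFilter is a hypothetical stand-in for PathFilter and the path strings are taken from the sample output elsewhere in these notes. Because PathFilter also has a single method, a lambda should work in its place on Java 8 and later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-JDK illustration of a single-method filter; not Hadoop code.
class SuffixFilterDemo {
    /** Hypothetical stand-in for PathFilter: one accept method. */
    interface NameFilter {
        boolean accept(String path);
    }

    /** Returns the paths that the filter accepts, in their original order. */
    static List<String> list(List<String> paths, NameFilter filter) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            if (filter.accept(path)) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList(
                "/test/test2/Map.md", "/test/test2/biji.md", "/test/test2/haha.txt");
        // The condition is already a boolean, so the filter can return it directly.
        System.out.println(list(all, path -> path.endsWith(".md")));
        // prints [/test/test2/Map.md, /test/test2/biji.md]
    }
}
```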

8. Positioned file reads
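
A positioned read starts at a chosen byte offset instead of at the beginning of the file. In the Hadoop API, FSDataInputStream offers seek(long) and read(long position, byte[] buffer, int offset, int length) for this. The sketch below shows the same idea on a local file with plain JDK classes, so it runs without a cluster; PositionedReadDemo and readAt are hypothetical names, and Path is java.nio.file.Path.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Local-file illustration of a positioned read; not Hadoop code.
class PositionedReadDemo {
    /** Reads up to length bytes starting at the given byte offset. */
    static byte[] readAt(Path file, long position, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            // Keep reading until the buffer is full or the end of the file is reached.
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position + buffer.position());
                if (n < 0) {
                    break;
                }
            }
            byte[] result = new byte[buffer.position()];
            buffer.flip();
            buffer.get(result);
            return result;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("hdfs-demo", ".txt");
        Files.write(file, "hello hdfs".getBytes(StandardCharsets.US_ASCII));
        // Skip the first six bytes ("hello ") and read the next four.
        System.out.println(new String(readAt(file, 6, 4), StandardCharsets.US_ASCII)); // hdfs
        Files.delete(file);
    }
}
```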

9. Studying the FileSystem class

Studying FileSystem

Today's topics

Studying the FileSystem class alongside the official documentation

Methods in FileSystem

  * boolean exists(Path p)
  
  * boolean isDirectory(Path p)

  * boolean isFile(Path p)
  
  * FileStatus getFileStatus(Path p)
  
  * Path getHomeDirectory()
  
  * FileStatus[] listStatus(Path path, PathFilter filter)
  
    FileStatus[] listStatus(Path path)
  
    FileStatus[] listStatus(Path[] paths, PathFilter filter)
  
    FileStatus[] listStatus(Path[] paths)
  
  * RemoteIterator[LocatedFileStatus] listLocatedStatus(Path path, PathFilter filter)
  
    RemoteIterator[LocatedFileStatus] listLocatedStatus(Path path)
  
    RemoteIterator[LocatedFileStatus] listFiles(Path path, boolean recursive)

  * BlockLocation[] getFileBlockLocations(FileStatus f, int s, int l)
  
    BlockLocation[] getFileBlockLocations(Path P, int S, int L)
  
  * long getDefaultBlockSize()
  
    long getDefaultBlockSize(Path p)
  
    long getBlockSize(Path p)

  * boolean mkdirs(Path p, FsPermission permission)
  
  * FSDataOutputStream create(Path, ...)
  
    FSDataOutputStream append(Path p, int bufferSize, Progressable progress)
  
    FSDataInputStream open(Path f, int bufferSize)
  
  * boolean delete(Path p, boolean recursive)
  
  * boolean rename(Path src, Path d)
  
  * void concat(Path p, Path sources[])
  
  * boolean truncate(Path p, long newLength)
  
  * interface RemoteIterator
          boolean hasNext()
          E next()

  * interface StreamCapabilities
          boolean hasCapability(capability)

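Preparation

Start the Hadoop cluster with start-dfs.sh
Access the HDFS file system from Eclipse
- Import the required jar files

Connecting to HDFS and obtaining a FileSystem object

First approach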

  * public static FileSystem get(Configuration conf) throws IOException

  // Create the connection to HDFS
  Configuration conf = new Configuration();
  conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");   // IP address of the NameNode, port 9000
  
  // Obtain the FileSystem
  FileSystem fileSystem = FileSystem.get(conf);

Second approach

  * public static FileSystem get(URI uri,Configuration conf,String user)
            throws IOException,
                   InterruptedException

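  // Only needed when hdfs:// URLs are read through java.net.URL; FileSystem.get itself does not require it
  URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
  FileSystem fileSystem = FileSystem.get(new URI("hdfs://10.9.190.90:9000"),new Configuration(),"root");
  // The working directory now becomes /user/root

Comparing the two approaches
- The second approach declares InterruptedException because it performs the lookup on behalf of the given user
- Both approaches can hand back a cached object, as the documentation warns:
  - the static FileSystem get(URI uri, Configuration conf,String user) method MAY return a pre-existing instance of a filesystem client class—a class that may also be in use in other threads. The implementations of FileSystem shipped with Apache Hadoop do not make any attempt to synchronize access to the working directory field. (get may return a FileSystem object that already exists and is shared with other threads, so changing the working directory is not thread-safe, and closing the object affects every piece of code that holds it)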
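
The effect of a shared, cached client can be reproduced in a few lines. This is a simplified sketch: ClientCacheDemo, Client and the string cache key are hypothetical, while the real cache is internal to Hadoop and keyed differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified illustration of a client cache; not Hadoop's actual implementation.
class ClientCacheDemo {
    /** Hypothetical client with one piece of shared mutable state. */
    static final class Client {
        volatile String workingDirectory = "/";
    }

    private static final Map<String, Client> CACHE = new ConcurrentHashMap<>();

    /** Returns the cached client for (uri, user), creating it on first use. */
    static Client get(String uri, String user) {
        return CACHE.computeIfAbsent(uri + "|" + user, key -> new Client());
    }

    public static void main(String[] args) {
        Client a = get("hdfs://10.9.190.90:9000", "root");
        Client b = get("hdfs://10.9.190.90:9000", "root");
        a.workingDirectory = "/user/anna";
        System.out.println(a == b);             // true: both calls returned the same object
        System.out.println(b.workingDirectory); // /user/anna: the change is visible through b
    }
}
```

When a private instance is needed, FileSystem.newInstance(...) is the usual way to bypass the cache.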

Introduction to org.apache.hadoop.fs.FileSystem

The abstract FileSystem class is the original class to access Hadoop filesystems; non-abstract subclasses exist for all Hadoop-supported filesystems. (The abstract base class FileSystem defines the operations available on Hadoop file systems.)

All operations that take a Path to this interface MUST support relative paths. In such a case, they must be resolved relative to the working directory defined by setWorkingDirectory(). (setWorkingDirectory() sets the working directory that relative paths are resolved against.)

getWorkingDirectory() in FileSystem returns the current working directory

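Code

  // Connect to the HDFS file system
  Configuration conf = new Configuration();
  conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
  
  // Obtain the FileSystem object
  FileSystem fileSystem = FileSystem.get(conf);
  
  // Print the current working directory
  System.out.println("=========Current working directory=============");
  System.out.println(fileSystem.getWorkingDirectory());
  
  // Set a new working directory
  fileSystem.setWorkingDirectory(new Path("hdfs://10.9.190.90:9000/user/anna"));  // Path plays a role in HDFS similar to java.io.File: it represents a path
  
  // Print the working directory after the change
  System.out.println("=========Working directory after the change=============");
  System.out.println(fileSystem.getWorkingDirectory());

Result

  =========Current working directory=============
  hdfs://10.9.190.90:9000/user/root
  =========Working directory after the change=============
  hdfs://10.9.190.90:9000/user/anna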
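
How a relative path is resolved against the working directory can be sketched with java.net.URI. This only illustrates the resolution rule; it is not the code Hadoop's Path class uses, and WorkingDirectoryDemo is a hypothetical name.

```java
import java.net.URI;

// Illustration of resolving paths against a working directory; not Hadoop code.
class WorkingDirectoryDemo {
    /** Resolves a relative or absolute path against a working directory URI. */
    static URI resolve(URI workingDirectory, String path) {
        String base = workingDirectory.toString();
        // URI.resolve treats the last segment as a file unless the base ends with '/'.
        URI directory = base.endsWith("/") ? workingDirectory : URI.create(base + "/");
        return directory.resolve(path);
    }

    public static void main(String[] args) {
        URI workingDirectory = URI.create("hdfs://10.9.190.90:9000/user/anna");
        // A relative path is appended to the working directory.
        System.out.println(resolve(workingDirectory, "hdfs/test.txt"));
        // prints hdfs://10.9.190.90:9000/user/anna/hdfs/test.txt
        // An absolute path ignores the working directory and keeps only scheme and authority.
        System.out.println(resolve(workingDirectory, "/test"));
        // prints hdfs://10.9.190.90:9000/test
    }
}
```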

FileSystem methods: checks

Background

org.apache.hadoop.fs.Path is similar to java.io.File; it represents a file path in HDFS

Methods
- public boolean exists(Path f) throws IOException
  - Checks whether the path exists
- public boolean isDirectory(Path f) throws IOException
  - Checks whether the path is a directory
- public boolean isFile(Path f) throws IOException
  - Checks whether the path is a file

Exercise

 try {
     // Connect to the HDFS file system
     Configuration conf = new Configuration();
     conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
     
     // Obtain the FileSystem object
     FileSystem fileSystem = FileSystem.get(conf);
     
     // Check whether the path exists
     System.out.println(fileSystem.exists(new Path("/test")));   //true
     
     // Check whether it is a directory
     System.out.println(fileSystem.isDirectory(new Path("/test")));  //true
     
     // Check whether it is a file
     System.out.println(fileSystem.isFile(new Path("/test")));           //false
 }catch(Exception e) {
     e.printStackTrace();
 }

FileSystem methods: retrieving file information

Methods
- public abstract FileStatus getFileStatus(Path f) throws IOException
  - Return a file status object that represents the path.
  - The return type is FileStatus
- public Path getHomeDirectory()
  - Return the current user's home directory in this FileSystem. The default implementation returns "/user/$USER/".

Exercise

 try {
     // Connect to the HDFS file system
     Configuration conf = new Configuration();
     conf.set("fs.defaultFS", "hdfs://10.9.190.90:9000");
     
     // Obtain the FileSystem object
     FileSystem fileSystem = FileSystem.get(conf);
     
     // Get the current user's home directory
     System.out.println("========Current user's home directory============");
     Path path = fileSystem.getHomeDirectory();
     System.out.println(path);
     
     // Get the file status object
     System.out.println("============File information===============");
     FileStatus status = fileSystem.getFileStatus(new Path("/eclipse"));
     System.out.println("Path : " + status.getPath());
     System.out.println("isFile ? " + status.isFile());          
     System.out.println("Block size : " + status.getBlockSize());
     System.out.println("Permissions : " + status.getPermission());
     System.out.println("Replication : " + status.getReplication());
     System.out.println("isSymlink : " + status.isSymlink());            
     
 }catch(Exception e) {
     e.printStackTrace();
 }


 /*
     Output on JDK 1.8:
  *  ------------------------------------------------
  *  ========Current user's home directory============
     hdfs://10.9.190.90:9000/user/anna
     ============File information===============
     Path : hdfs://10.9.190.90:9000/eclipse
     isFile ? true
     Block size : 134217728
     Permissions : rw-r--r--
     Replication : 3
     isSymlink : false
     ------------------------------------------------
 */

Commonly used FileStatus methods
- public Path getPath()
- public boolean isFile()
- public boolean isSymlink()
- public long getBlockSize()
- public short getReplication()
- public FsPermission getPermission()

FileSystem methods: listing a directory (1)

Methods
- public abstract FileStatus[] listStatus(Path f) throws FileNotFoundException,IOException
  - Returns an array of FileStatus
- public FileStatus[] listStatus(Path f,PathFilter filter)
  throws FileNotFoundException,IOException
- public FileStatus[] listStatus(Path[] files,PathFilter filter)
  throws FileNotFoundException,IOException
  - Note that PathFilter is an interface with a single method, accept, which decides whether each path is kept
  - Enumerate all files found in the list of directories passed in, calling listStatus(path, filter) on each one.
- Note: in the runs shown here HDFS returned the entries sorted by name; the FileSystem API itself does not promise any particular order, so sort explicitly if order matters

Exercise 1: using FileStatus[] listStatus(Path f)

 // Using FileStatus[] listStatus(Path f)
 try {
     // Create the connection to HDFS
     Configuration conf = new Configuration();
     conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");
     
     // Obtain the FileSystem
     FileSystem fileSystem = FileSystem.get(conf);
     
     // List the contents of the /test directory
     FileStatus[] fileStatuses = fileSystem.listStatus(new Path("/test"));
     
     // Print each entry in the directory
     for(FileStatus fileStatus :fileStatuses) {
         System.out.println(fileStatus.getPath() + "  " + new Date(fileStatus.getAccessTime()) + "  " + 
             fileStatus.getBlockSize() + "  " + fileStatus.getPermission());
     }
 }catch(Exception e) {
     e.printStackTrace();
 }
 /*
 Output on JDK 1.8:
 ----------------------------------------------------------------------------
 hdfs://10.9.190.90:9000/test/hadoop-2.7.3.tar.gz  2012-07-26  134217728  rw-r--r--
 hdfs://10.9.190.90:9000/test/hello.txt  2012-07-26  134217728  rw-r--r--
 hdfs://10.9.190.90:9000/test/test2  1970-01-01  0  rwxr-xr-x
 ----------------------------------------------------------------------------
 */

Exercise 2: using FileStatus[] listStatus(Path f,PathFilter filter)

Requirement: list information about the files in /test/test2 whose names end with .md

Code:

  try {
      // Create the connection to HDFS
      Configuration conf = new Configuration();
      conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");
      
      // Obtain the FileSystem
      FileSystem fileSystem = FileSystem.get(conf);
      
      // List the entries in the directory whose names end with .md
      FileStatus[] statuses = fileSystem.listStatus(new Path("/test/test2"), new PathFilter() {
          
          @Override
          public boolean accept(Path path) {
              // Keep only paths that end with .md
              return path.toString().endsWith(".md");
          }
      });
      
      // Print the file information
      for(FileStatus status : statuses) {
          System.out.println("Path : " + status.getPath() + "  Permission : " + status.getPermission() + 
                  "  Replication : " + status.getReplication());
      }
  }catch(Exception e) {
      e.printStackTrace();
  }

Things to watch out for
- By the time the listStatus() operation returns to the caller, there is no guarantee that the information contained in the response is current. The details MAY be out of date, including the contents of any directory, the attributes of any files, and the existence of the path supplied. (The result of listStatus() is a snapshot, so it may already be out of date by the time the caller uses it.)

FileSystem methods: listing a directory (2)

Methods
- public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f)
  throws FileNotFoundException, IOException
- protected org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f,PathFilter filter)
  throws FileNotFoundException, IOException
  - Note: this overload is protected, so it is accessible from the class itself, from classes in the same package, and from subclasses in other packages
- Note: LocatedFileStatus is a subclass of FileStatus

Usage

 try {
     // Create the connection to HDFS
     Configuration conf = new Configuration();
     conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");
     
     // Obtain the FileSystem
     FileSystem fileSystem = FileSystem.get(conf);
     
     // List every entry in /test/test2 (no filter is applied here)
     RemoteIterator<LocatedFileStatus> iterator = fileSystem.listLocatedStatus(new Path("/test/test2"));
     
     while(iterator.hasNext()) {
         LocatedFileStatus status = iterator.next();
         System.out.println("Path : " + status.getPath() + "  Permission : " + status.getPermission() + 
                 "  Replication : " + status.getReplication());
     }           
 }catch(Exception e) {
     e.printStackTrace();
 }
 /*
  * Output on JDK 1.8:
  * ---------------------------------------------------------------------------------------------
  * Path : hdfs://10.9.190.90:9000/test/test2/Map.md  Permission : rw-r--r--  Replication : 3
    Path : hdfs://10.9.190.90:9000/test/test2/biji.md  Permission : rw-r--r--  Replication : 3
    Path : hdfs://10.9.190.90:9000/test/test2/haha.txt  Permission : rw-r--r--  Replication : 3
    ---------------------------------------------------------------------------------------------
  * */

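Differences from listStatus(Path p)
- listStatus returns a FileStatus[] array, which can be traversed with a for-each loop
- listLocatedStatus(Path p) returns a RemoteIterator of LocatedFileStatus, which is traversed with hasNext() and next()
- Note, however, that in the base FileSystem class the default listLocatedStatus() is built on top of listStatus(Path p); concrete file systems such as HDFS may override it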

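FileSystem methods: listing a directory (3)

Methods
- public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f,boolean recursive)
  throws FileNotFoundException,IOException
  - With recursive set to true, lists the files in the directory and in all of its subdirectories; only files are returned, not the directories themselves

Usage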

 try {
     // Create the connection to HDFS
     Configuration conf = new Configuration();
     conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");
     
     // Obtain the FileSystem
     FileSystem fileSystem = FileSystem.get(conf);
     
     // Recursively list every file under /test
     RemoteIterator<LocatedFileStatus> iterator = fileSystem.listFiles(new Path("/test"),true);
     
     while(iterator.hasNext()) {
         LocatedFileStatus status = iterator.next();
         System.out.println("Path : " + status.getPath() + "  Permission : " + status.getPermission() + 
                 "  Replication : " + status.getReplication());
     }           
 }catch(Exception e) {
     e.printStackTrace();
 }

 /*
  *  Output on JDK 1.8:
  *  ---------------------------------------------------------------------------------------------------
  *  Path : hdfs://10.9.190.90:9000/test/hadoop-2.7.3.tar.gz  Permission : rw-r--r--  Replication : 3
     Path : hdfs://10.9.190.90:9000/test/hello.txt  Permission : rw-r--r--  Replication : 3
     Path : hdfs://10.9.190.90:9000/test/test2/Map.md  Permission : rw-r--r--  Replication : 3
     Path : hdfs://10.9.190.90:9000/test/test2/biji.md  Permission : rw-r--r--  Replication : 3
     Path : hdfs://10.9.190.90:9000/test/test2/haha.txt  Permission : rw-r--r--  Replication : 3
     ---------------------------------------------------------------------------------------------------
  * */

FileSystem methods: getting the block locations of a file

Methods
- public BlockLocation[] getFileBlockLocations(Path p,long start,long len) throws IOException
- public BlockLocation[] getFileBlockLocations(FileStatus file,long start,long len) throws IOException

Usage

 // Show where the blocks of /test/hadoop are stored
 try {
     // Create the connection to HDFS
     Configuration conf = new Configuration();
     conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");
     
     // Obtain the FileSystem
     FileSystem fileSystem = FileSystem.get(conf);
     
     FileStatus status = fileSystem.getFileStatus(new Path("/test/hadoop"));
     
     BlockLocation[] locations = fileSystem.getFileBlockLocations(status, 0,status.getLen());
     
     for(BlockLocation location : locations) {
         // getHosts() and getNames() return String[], so concatenating them prints array references
         // such as [Ljava.lang.String;@18ece7f4; wrap them in Arrays.toString(...) to print the contents
         System.out.println("host : " + location.getHosts() + " name : " + location.getNames() + " length : " + location.getLength());
     } 
 }catch(Exception e) {
     e.printStackTrace();
 }
 /*
 Output on JDK 1.8:
 ------------------------------------------------------------------------------
 host : [Ljava.lang.String;@18ece7f4 name : [Ljava.lang.String;@3cce57c7 length : 134217728
 host : [Ljava.lang.String;@1cf56a1c name : [Ljava.lang.String;@33f676f6 length : 79874467
 ------------------------------------------------------------------------------
 */
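
The two length values in the output follow from how a file is cut into blocks: every block has the full block size except possibly the last one. A small sketch of that arithmetic, where BlockSplitDemo and blockLengths are hypothetical names and the total length is simply the sum of the two lengths printed above:

```java
import java.util.ArrayList;
import java.util.List;

// Arithmetic illustration only; not Hadoop code.
class BlockSplitDemo {
    /** Returns the length of each block for a file of the given size. */
    static List<Long> blockLengths(long fileLength, long blockSize) {
        List<Long> lengths = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // The last block holds whatever is left over.
            lengths.add(Math.min(blockSize, fileLength - offset));
        }
        return lengths;
    }

    public static void main(String[] args) {
        long blockSize = 134217728L;              // 128 MB, the block size shown in the output
        long fileLength = 134217728L + 79874467L; // sum of the two block lengths above
        System.out.println(blockLengths(fileLength, blockSize)); // [134217728, 79874467]
    }
}
```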

FileSystem methods: getting an output stream to a file

Methods
- public FSDataOutputStream create(Path f) throws IOException
- public FSDataOutputStream create(Path f,boolean overwrite)
  throws IOException
  - overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an exception will be thrown.
- public FSDataOutputStream create(Path f,
  Progressable progress)
  throws IOException
  - Create an FSDataOutputStream at the indicated Path with write-progress reporting. Files are overwritten by default.
- public FSDataOutputStream create(Path f,boolean overwrite,int bufferSize)
  throws IOException
- public FSDataOutputStream create(Path f,boolean overwrite,int bufferSize, Progressable progress)throws IOException
- FSDataOutputStream append(Path p, int bufferSize, Progressable progress)

Usage: upload the local file E:/hzy.jpg to /1.jpg on HDFS

 public static void main(String[] args) {
     File file = new File("E:/hzy.jpg");

     try {
         // Create the connection to HDFS
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");

         // Obtain the FileSystem
         FileSystem fileSystem = FileSystem.get(conf);

         // try-with-resources closes both streams, even when the copy fails
         try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(file));
              // Open an output stream to /1.jpg on HDFS
              FSDataOutputStream out = fileSystem.create(new Path("/1.jpg"), new Progressable() {
                  long callCount = 0;

                  @Override
                  public void progress() {
                      // Invoked from time to time while data is being written, not once per byte,
                      // so it is only a rough sign of activity and cannot give an exact percentage
                      callCount++;
                      System.out.println("progress callback #" + callCount);
                  }
              })) {

             // Copy byte by byte; IOUtils.copyBytes(in, out, conf) could be used here instead
             int b;
             while ((b = in.read()) != -1) {
                 out.write(b);
             }
         }
     } catch (Exception e) {
         e.printStackTrace();
     }
 }
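
If a percentage is wanted, one option is to count the bytes in the copy loop itself. The sketch below uses in-memory streams so it runs anywhere; CopyProgressDemo, percent and copy are hypothetical names. It also shows why the multiplication must come before the division when working with integers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// In-memory illustration of a buffered copy with progress reporting; not Hadoop code.
class CopyProgressDemo {
    /** Integer percentage; multiplying first avoids (done / total) truncating to 0. */
    static long percent(long done, long total) {
        return total == 0 ? 100 : done * 100 / total;
    }

    /** Copies in to out in 4 KB chunks, printing the progress, and returns the byte count. */
    static long copy(InputStream in, OutputStream out, long total) throws IOException {
        byte[] buffer = new byte[4096];
        long done = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            done += n;
            System.out.println("progress: " + percent(done, total) + " %");
        }
        return done;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            long copied = copy(in, out, data.length);
            System.out.println("copied " + copied + " bytes");
        }
    }
}
```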

FileSystem methods: getting an input stream to a file (reading a file)

Methods
- public FSDataInputStream open(Path f) throws IOException
- public abstract FSDataInputStream open(Path f,int bufferSize)throws IOException

Usage: copy /1.jpg from HDFS to the local file E:/hzyCopy.jpg

 try {
     // Create the connection to HDFS
     Configuration conf = new Configuration();
     conf.set("fs.defaultFS","hdfs://10.9.190.90:9000");
     
     // Obtain the FileSystem
     FileSystem fileSystem = FileSystem.get(conf);
     
     // Open the HDFS input stream and the local output stream; try-with-resources closes both
     try (FSDataInputStream in = fileSystem.open(new Path("/1.jpg"));
          BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(new File("E:/hzyCopy.jpg")))) {
         
         // Copy in 3 KB chunks
         int len;
         byte[] bArr = new byte[1024*3];
         while((len = in.read(bArr)) != -1) {
             out.write(bArr,0,len);
         }
     }
     
 }catch(Exception e) {
     e.printStackTrace();
 }

FileSystem methods: creating

public boolean mkdirs(Path f) throws IOException

FileSystem methods: deleting

public abstract boolean delete(Path f,boolean recursive) throws IOException
Note: concurrency matters here, since other clients may be using the path while it is being deleted

FileSystem methods: renaming

public abstract boolean rename(Path src,Path dst)throws IOException

Other FileSystem methods

public void concat(Path trg,Path[] psrcs)throws IOException
- Concat existing files together.
public boolean truncate(Path f,long newLength)throws IOException

interface RemoteIterator

Definition

 public interface RemoteIterator<E> {
   boolean hasNext() throws IOException;
   E next() throws IOException;
 }

The primary use of RemoteIterator in the filesystem APIs is to list files on (possibly remote) filesystems.
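
The shape of this interface can be tried out without Hadoop on the classpath. In the sketch below, RemoteIterator is a local copy of the two-method definition quoted above, while fromArray and drain are hypothetical helpers; wrapping an already fetched array is only the simplest possible implementation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Plain-JDK illustration of the RemoteIterator pattern; not Hadoop code.
class RemoteIteratorDemo {
    /** Local copy of the interface shape, so the sketch needs no Hadoop jars. */
    interface RemoteIterator<E> {
        boolean hasNext() throws IOException;
        E next() throws IOException;
    }

    /** Wraps an already fetched array in a RemoteIterator. */
    static <E> RemoteIterator<E> fromArray(E[] items) {
        return new RemoteIterator<E>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < items.length;
            }

            @Override
            public E next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return items[index++];
            }
        };
    }

    /** Collects everything the iterator yields; each call may throw IOException. */
    static <E> List<E> drain(RemoteIterator<E> iterator) throws IOException {
        List<E> result = new ArrayList<>();
        while (iterator.hasNext()) {
            result.add(iterator.next());
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String[] names = {"/test/hello.txt", "/test/test2"};
        System.out.println(drain(fromArray(names))); // [/test/hello.txt, /test/test2]
    }
}
```

Unlike java.util.Iterator, both methods may throw IOException, which lets an implementation fetch the next batch of entries from a remote server lazily.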

Usage

 //listLocatedStatus(Path f)
 public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f)
     throws FileNotFoundException,IOException


 //listLocatedStatus(Path f,PathFilter filter)
 protected org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listLocatedStatus(Path f,PathFilter filter)
     throws FileNotFoundException,IOException

 //listStatusIterator(Path p)
 public org.apache.hadoop.fs.RemoteIterator<FileStatus> listStatusIterator(Path p)
     throws FileNotFoundException,IOException

 //listFiles(Path f,boolean recursive)
 public org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f,boolean recursive)
     throws FileNotFoundException,IOException

interface StreamCapabilities

Methods

 public interface StreamCapabilities {
   boolean hasCapability(String capability);
 }

Usage

 Hadoop 2.7.3 does not have this interface; it is present in 2.9.1
