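Java IO Performance Analysis
Overview
IO operations are extremely common in programs and easily become a performance bottleneck, so it is worth analyzing their cost and improving the code accordingly.
This article mainly covers traditional IO (TIO) and touches only briefly on the newer IO APIs (NIO and NIO2). The analysis focuses on reading and writing disk files; network IO, console/window IO, and so on are not examined in depth.
Almost all of the IO discussed here is low-level IO.
TIO Analysis
Common IO operation types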
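- Read and write one byte/character at a time by calling read() and write(int) directly
- Create your own buffer and read or write in blocks by calling read(char/byte[] buf), write(char/byte[] buf), and similar methods
- Use the buffered reader/writer classes in the io package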
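
As a minimal sketch of the three styles listed above, the following writes the same byte array in each way. The class name `IoStyles` and its method names are hypothetical, chosen only for this illustration; they are not the test code used later in the article.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class IoStyles {
    // Style 1: one write(int) call per byte, straight to the file stream.
    public static void writeSingleBytes(File file, byte[] data) throws IOException {
        try (OutputStream out = new FileOutputStream(file)) {
            for (byte b : data) {
                out.write(b);
            }
        }
    }

    // Style 2: fill a caller-managed block, then write the whole block at once.
    public static void writeBlocks(File file, byte[] data, int blockSize) throws IOException {
        byte[] block = new byte[blockSize];
        try (OutputStream out = new FileOutputStream(file)) {
            for (int off = 0; off < data.length; off += blockSize) {
                int len = Math.min(blockSize, data.length - off);
                System.arraycopy(data, off, block, 0, len);
                out.write(block, 0, len);
            }
        }
    }

    // Style 3: single-byte calls again, but BufferedOutputStream batches them.
    public static void writeBuffered(File file, byte[] data) throws IOException {
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(file))) {
            for (byte b : data) {
                out.write(b);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, io".getBytes(StandardCharsets.US_ASCII);
        File file = File.createTempFile("io-styles", ".bin");
        file.deleteOnExit();
        writeSingleBytes(file, data);
        writeBlocks(file, data, 4);
        writeBuffered(file, data);
        System.out.println("bytes on disk: " + file.length());
    }
}
```

All three methods produce the same file contents; they differ only in how many calls reach the underlying file stream.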
 
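Timing tests
Java's character stream classes read bytes underneath and then convert them to characters before returning them, so they take more time. For that reason the operations on character streams are not compared separately here: the differences follow the same pattern as for byte streams, only more pronounced.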
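
To illustrate the extra decoding step, here is a small sketch that reads the same encoded data once as raw bytes and once through an InputStreamReader. The class name `ByteVsCharCount` and the sample string are assumptions made for this example; it uses an in-memory stream rather than a file to stay self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ByteVsCharCount {
    // Counts raw bytes: no decoding takes place.
    public static int countBytes(byte[] encoded) throws IOException {
        int n = 0;
        try (InputStream in = new ByteArrayInputStream(encoded)) {
            while (in.read() != -1) {
                n++;
            }
        }
        return n;
    }

    // Counts characters: the reader pulls bytes and decodes them as UTF-8 first.
    public static int countChars(byte[] encoded) throws IOException {
        int n = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(encoded), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // Two ASCII letters plus two CJK characters (3 bytes each in UTF-8).
        byte[] encoded = "IO\u6027\u80fd".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes: " + countBytes(encoded)); // 8
        System.out.println("chars: " + countChars(encoded)); // 4
    }
}
```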
First, some utility methods:

```java
import java.io.*;

public class WriteDataTest {
    /** Writes size random bytes one at a time with write(int). */
    public static void generateByte(String filename, long size) throws Exception {
        // FileOutputStream creates the file if it does not exist yet.
        try (FileOutputStream fos = new FileOutputStream(filename)) {
            for (long i = 0; i < size; i++) {
                fos.write((int) (Math.random() * 128));
            }
        }
    }

    /** Writes size random bytes in blocks of at most byteSize bytes. */
    public static void generateBytes(String filename, long size, int byteSize) throws Exception {
        if (byteSize < 1)
            byteSize = 512;
        if (byteSize > size)
            byteSize = (int) size;
        byte[] buf = new byte[byteSize];
        try (FileOutputStream fos = new FileOutputStream(filename)) {
            for (long written = 0; written < size; written += byteSize) {
                // The last block may be shorter when size is not a multiple of byteSize.
                int len = (int) Math.min(byteSize, size - written);
                for (int j = 0; j < len; j++)
                    buf[j] = (byte) (Math.random() * 128);
                fos.write(buf, 0, len);
            }
        }
    }

    /** Writes size random bytes one at a time through a BufferedOutputStream. */
    public static void generateBuffer(String filename, long size, int bufferSize) throws Exception {
        FileOutputStream fos = new FileOutputStream(filename);
        // A bufferSize below 1 selects the default buffer size.
        try (BufferedOutputStream bos = bufferSize < 1
                ? new BufferedOutputStream(fos)
                : new BufferedOutputStream(fos, bufferSize)) {
            for (long i = 0; i < size; i++) {
                bos.write((int) (Math.random() * 128));
            }
        }
    }
}

/** Simple stopwatch with millisecond resolution. */
class MyTimer {
    private long startTime = -1;
    private long endTime = -1;
    private long costTime = -1;

    /** Starts timing; returns false if the timer is already running. */
    public boolean start() {
        if (startTime != -1)
            return false;
        startTime = System.currentTimeMillis();
        return true;
    }

    /** Stops timing; returns false if the timer was not running. */
    public boolean stop() {
        if (startTime == -1)
            return false;
        endTime = System.currentTimeMillis();
        costTime = endTime - startTime;
        startTime = -1;
        return true;
    }

    /** Elapsed milliseconds of the last completed start/stop pair. */
    public long getTime() {
        return costTime;
    }

    @Override
    public String toString() {
        return "" + costTime;
    }
}
```
 
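Comparing write times of the three IO approaches
Each of the three approaches generates data of several different sizes, 10 times per size.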
```java
// Add this method to WriteDataTest.
public static void main(String... args) throws Exception {
    System.out.println("Generating data...");
    long[] size = new long[]{128*1024, 256*1024, 512*1024, 1024*1024, 2*1024*1024, 4*1024*1024, 8*1024*1024};
    String[] sizeName = new String[]{"128K", "256K", "512K", "1M", "2M", "4M", "8M"};
    String filename = "TestData";
    int count = 10;
    MyTimer timer = new MyTimer();
    for (int i = 0; i < size.length; i++) {
        // Block writes with a 1024-byte block.
        timer.start();
        System.out.println(">> generateBytes");
        for (int j = 0; j < count; j++) {
            generateBytes(filename, size[i], 1024);
        }
        timer.stop();
        System.out.printf("size: %s  costTime: %dms  avergeTime: %.2fms\n", sizeName[i], timer.getTime(), timer.getTime() / (count * 1.0));

        // Buffered writes with a 1024-byte buffer.
        timer.start();
        System.out.println(">> generateBuffer");
        for (int j = 0; j < count; j++) {
            generateBuffer(filename, size[i], 1024);
        }
        timer.stop();
        System.out.printf("size: %s  costTime: %dms  avergeTime: %.2fms\n", sizeName[i], timer.getTime(), timer.getTime() / (count * 1.0));

        // Unbuffered single-byte writes.
        timer.start();
        System.out.println(">> generateByte");
        for (int j = 0; j < count; j++) {
            generateByte(filename, size[i]);
        }
        timer.stop();
        System.out.printf("size: %s  costTime: %dms  avergeTime: %.2fms\n", sizeName[i], timer.getTime(), timer.getTime() / (count * 1.0));
    }
    System.out.println("\nDone.");
}
```
 
Test results
```
Generating data...
>> generateBytes
size: 128K  costTime: 53ms  avergeTime: 5.30ms
>> generateBuffer
size: 128K  costTime: 53ms  avergeTime: 5.30ms
>> generateByte
size: 128K  costTime: 2024ms  avergeTime: 202.40ms
>> generateBytes
size: 256K  costTime: 76ms  avergeTime: 7.60ms
>> generateBuffer
size: 256K  costTime: 78ms  avergeTime: 7.80ms
>> generateByte
size: 256K  costTime: 4026ms  avergeTime: 402.60ms
>> generateBytes
size: 512K  costTime: 159ms  avergeTime: 15.90ms
>> generateBuffer
size: 512K  costTime: 156ms  avergeTime: 15.60ms
>> generateByte
size: 512K  costTime: 8022ms  avergeTime: 802.20ms
>> generateBytes
size: 1M  costTime: 294ms  avergeTime: 29.40ms
>> generateBuffer
size: 1M  costTime: 298ms  avergeTime: 29.80ms
>> generateByte
size: 1M  costTime: 16606ms  avergeTime: 1660.60ms
>> generateBytes
size: 2M  costTime: 619ms  avergeTime: 61.90ms
>> generateBuffer
size: 2M  costTime: 667ms  avergeTime: 66.70ms
>> generateByte
size: 2M  costTime: 34323ms  avergeTime: 3432.30ms
>> generateBytes
size: 4M  costTime: 1269ms  avergeTime: 126.90ms
>> generateBuffer
size: 4M  costTime: 1272ms  avergeTime: 127.20ms
>> generateByte
size: 4M  costTime: 68344ms  avergeTime: 6834.40ms
>> generateBytes
size: 8M  costTime: 2491ms  avergeTime: 249.10ms
>> generateBuffer
size: 8M  costTime: 2622ms  avergeTime: 262.20ms
>> generateByte
size: 8M  costTime: 139931ms  avergeTime: 13993.10ms

Done.
```
 
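Clearly, single-byte writes are far slower than the other two approaches, while block writes are marginally faster than buffered writes in this run. More detailed tests follow.
Effect of block size on block IO speed
This test varies the block size used by the generateBytes method.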
```java
// Add this method to WriteDataTest.
public static void main(String... args) throws Exception {
    System.out.println("Generating data...");
    long size = 1024 * 1024;
    String filename = "1M";
    int count = 10;
    MyTimer timer = new MyTimer();

    // Loop bounds adjusted to match the recorded results, which cover block sizes 128 to 32768.
    for (int i = 128; i < 4096 * 16; i *= 2) {
        System.out.println("byteSize: " + i);
        timer.start();
        System.out.println(">> generateBytes");
        for (int j = 0; j < count; j++) {
            generateBytes(filename, size, i);
        }
        timer.stop();
        System.out.printf("count: %d  costTime: %dms  avergeTime: %.2fms\n", count, timer.getTime(), timer.getTime() / (count * 1.0));
    }
    System.out.println("\nDone.");
}
```
 
Test results
```
Generating data...
byteSize: 128
>> generateBytes
count: 10  costTime: 456ms  avergeTime: 45.60ms
byteSize: 256
>> generateBytes
count: 10  costTime: 337ms  avergeTime: 33.70ms
byteSize: 512
>> generateBytes
count: 10  costTime: 307ms  avergeTime: 30.70ms
byteSize: 1024
>> generateBytes
count: 10  costTime: 290ms  avergeTime: 29.00ms
byteSize: 2048
>> generateBytes
count: 10  costTime: 304ms  avergeTime: 30.40ms
byteSize: 4096
>> generateBytes
count: 10  costTime: 278ms  avergeTime: 27.80ms
byteSize: 8192
>> generateBytes
count: 10  costTime: 287ms  avergeTime: 28.70ms
byteSize: 16384
>> generateBytes
count: 10  costTime: 279ms  avergeTime: 27.90ms
byteSize: 32768
>> generateBytes
count: 10  costTime: 286ms  avergeTime: 28.60ms

Done.
```
 
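These results show that block size does affect write speed, but a larger block is not always faster: beyond roughly 1024 bytes the gains become very small.
Effect of buffer size on buffered IO speed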
```java
// Add this method to WriteDataTest.
public static void main(String... args) throws Exception {
    System.out.println("Generating data...");
    long size = 1024 * 1024;
    String filename = "1M";
    int count = 10;
    MyTimer timer = new MyTimer();

    // Buffer sizes 2, 4, 8, ..., 32768.
    for (int i = 2; i < 4096 * 16; i *= 2) {
        System.out.println("byteSize: " + i);
        timer.start();
        System.out.println(">> generateBuffer");
        for (int j = 0; j < count; j++) {
            generateBuffer(filename, size, i);
        }
        timer.stop();
        System.out.printf("count: %d  costTime: %dms  avergeTime: %.2fms\n", count, timer.getTime(), timer.getTime() / (count * 1.0));
    }
    System.out.println("\nDone.");
}
```
 
Test results
```
Generating data...
byteSize: 2
>> generateBuffer
count: 10  costTime: 9106ms  avergeTime: 910.60ms
byteSize: 4
>> generateBuffer
count: 10  costTime: 4740ms  avergeTime: 474.00ms
byteSize: 8
>> generateBuffer
count: 10  costTime: 2581ms  avergeTime: 258.10ms
byteSize: 16
>> generateBuffer
count: 10  costTime: 1408ms  avergeTime: 140.80ms
byteSize: 32
>> generateBuffer
count: 10  costTime: 877ms  avergeTime: 87.70ms
byteSize: 64
>> generateBuffer
count: 10  costTime: 599ms  avergeTime: 59.90ms
byteSize: 128
>> generateBuffer
count: 10  costTime: 459ms  avergeTime: 45.90ms
byteSize: 256
>> generateBuffer
count: 10  costTime: 397ms  avergeTime: 39.70ms
byteSize: 512
>> generateBuffer
count: 10  costTime: 361ms  avergeTime: 36.10ms
byteSize: 1024
>> generateBuffer
count: 10  costTime: 341ms  avergeTime: 34.10ms
byteSize: 2048
>> generateBuffer
count: 10  costTime: 307ms  avergeTime: 30.70ms
byteSize: 4096
>> generateBuffer
count: 10  costTime: 297ms  avergeTime: 29.70ms
byteSize: 8192
>> generateBuffer
count: 10  costTime: 294ms  avergeTime: 29.40ms
byteSize: 16384
>> generateBuffer
count: 10  costTime: 306ms  avergeTime: 30.60ms
byteSize: 32768
>> generateBuffer
count: 10  costTime: 306ms  avergeTime: 30.60ms

Done.
```
 
As with block IO, the buffer size affects the speed of buffered IO, and the effect shrinks as the buffer grows.
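
One way to see why small buffers are so slow is to count how many write calls actually reach the underlying stream. The sketch below is an illustration only: `FlushCounter` and `CountingSink` are hypothetical names, and the sink discards data instead of writing to disk, so it shows call counts rather than timings.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class FlushCounter {
    // Hypothetical sink: discards all data and counts the calls that reach it.
    static final class CountingSink extends OutputStream {
        int calls;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    // Writes totalBytes single bytes through a buffer of bufferSize bytes
    // and returns how many write calls were forwarded to the sink.
    public static int underlyingWrites(int totalBytes, int bufferSize) throws IOException {
        CountingSink sink = new CountingSink();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < totalBytes; i++) {
                out.write(i);
            }
        }
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        for (int bufferSize = 2; bufferSize <= 1024; bufferSize *= 4) {
            System.out.println("bufferSize " + bufferSize + ": "
                    + underlyingWrites(1024, bufferSize) + " underlying writes");
        }
    }
}
```

For 1024 bytes, a 2-byte buffer forwards 512 writes while a 1024-byte buffer forwards just one. Each doubling of the buffer halves the number of calls, which matches the roughly halving times at the small end of the results above.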
Speed difference when the block size equals the buffer size
This test compares the two approaches at each buffer size.
```java
// Add this method to WriteDataTest. The method header and setup are restored
// from the previous tests; the original listing started at the loop.
public static void main(String... args) throws Exception {
    System.out.println("Generating data...");
    long size = 1024 * 1024;
    String filename = "1M";
    int count = 10;
    MyTimer timer = new MyTimer();

    for (int i = 2; i < 4096 * 16; i *= 2) {
        System.out.println("byteSize: " + i);
        timer.start();
        System.out.println(">> generateBytes");
        for (int j = 0; j < count; j++) {
            generateBytes(filename, size, i);
        }
        timer.stop();
        System.out.printf("count: %d  costTime: %dms  avergeTime: %.2fms\n", count, timer.getTime(), timer.getTime() / (count * 1.0));

        timer.start();
        System.out.println(">> generateBuffer");
        for (int j = 0; j < count; j++) {
            generateBuffer(filename, size, i);
        }
        timer.stop();
        System.out.printf("count: %d  costTime: %dms  avergeTime: %.2fms\n", count, timer.getTime(), timer.getTime() / (count * 1.0));
    }
    System.out.println("\nDone.");
}
```
 
Test results
```
Generating data...
byteSize: 2
>> generateBytes
count: 10  costTime: 8532ms  avergeTime: 853.20ms
>> generateBuffer
count: 10  costTime: 8558ms  avergeTime: 855.80ms
byteSize: 4
>> generateBytes
count: 10  costTime: 4405ms  avergeTime: 440.50ms
>> generateBuffer
count: 10  costTime: 4441ms  avergeTime: 444.10ms
byteSize: 8
>> generateBytes
count: 10  costTime: 2395ms  avergeTime: 239.50ms
>> generateBuffer
count: 10  costTime: 2368ms  avergeTime: 236.80ms
byteSize: 16
>> generateBytes
count: 10  costTime: 1326ms  avergeTime: 132.60ms
>> generateBuffer
count: 10  costTime: 1340ms  avergeTime: 134.00ms
byteSize: 32
>> generateBytes
count: 10  costTime: 821ms  avergeTime: 82.10ms
>> generateBuffer
count: 10  costTime: 814ms  avergeTime: 81.40ms
byteSize: 64
>> generateBytes
count: 10  costTime: 559ms  avergeTime: 55.90ms
>> generateBuffer
count: 10  costTime: 559ms  avergeTime: 55.90ms
byteSize: 128
>> generateBytes
count: 10  costTime: 419ms  avergeTime: 41.90ms
>> generateBuffer
count: 10  costTime: 458ms  avergeTime: 45.80ms
byteSize: 256
>> generateBytes
count: 10  costTime: 350ms  avergeTime: 35.00ms
>> generateBuffer
count: 10  costTime: 367ms  avergeTime: 36.70ms
byteSize: 512
>> generateBytes
count: 10  costTime: 338ms  avergeTime: 33.80ms
>> generateBuffer
count: 10  costTime: 333ms  avergeTime: 33.30ms
byteSize: 1024
>> generateBytes
count: 10  costTime: 312ms  avergeTime: 31.20ms
>> generateBuffer
count: 10  costTime: 309ms  avergeTime: 30.90ms
byteSize: 2048
>> generateBytes
count: 10  costTime: 300ms  avergeTime: 30.00ms
>> generateBuffer
count: 10  costTime: 303ms  avergeTime: 30.30ms
byteSize: 4096
>> generateBytes
count: 10  costTime: 292ms  avergeTime: 29.20ms
>> generateBuffer
count: 10  costTime: 317ms  avergeTime: 31.70ms
byteSize: 8192
>> generateBytes
count: 10  costTime: 284ms  avergeTime: 28.40ms
>> generateBuffer
count: 10  costTime: 297ms  avergeTime: 29.70ms
byteSize: 16384
>> generateBytes
count: 10  costTime: 297ms  avergeTime: 29.70ms
>> generateBuffer
count: 10  costTime: 310ms  avergeTime: 31.00ms
byteSize: 32768
>> generateBytes
count: 10  costTime: 323ms  avergeTime: 32.30ms
>> generateBuffer
count: 10  costTime: 291ms  avergeTime: 29.10ms

Done.
```
 
Results when generating 4M of data, repeating each test 25 times, and starting from a block size of 128:
```
Generating data...
byteSize: 128
>> generateBytes
count: 25  costTime: 3896ms  avergeTime: 155.84ms
>> generateBuffer
count: 25  costTime: 3932ms  avergeTime: 157.28ms
byteSize: 256
>> generateBytes
count: 25  costTime: 3151ms  avergeTime: 126.04ms
>> generateBuffer
count: 25  costTime: 3194ms  avergeTime: 127.76ms
byteSize: 512
>> generateBytes
count: 25  costTime: 2831ms  avergeTime: 113.24ms
>> generateBuffer
count: 25  costTime: 2854ms  avergeTime: 114.16ms
byteSize: 1024
>> generateBytes
count: 25  costTime: 2665ms  avergeTime: 106.60ms
>> generateBuffer
count: 25  costTime: 2690ms  avergeTime: 107.60ms
byteSize: 2048
>> generateBytes
count: 25  costTime: 2578ms  avergeTime: 103.12ms
>> generateBuffer
count: 25  costTime: 2643ms  avergeTime: 105.72ms
byteSize: 4096
>> generateBytes
count: 25  costTime: 2516ms  avergeTime: 100.64ms
>> generateBuffer
count: 25  costTime: 2544ms  avergeTime: 101.76ms
byteSize: 8192
>> generateBytes
count: 25  costTime: 2502ms  avergeTime: 100.08ms
>> generateBuffer
count: 25  costTime: 2539ms  avergeTime: 101.56ms
byteSize: 16384
>> generateBytes
count: 25  costTime: 2607ms  avergeTime: 104.28ms
>> generateBuffer
count: 25  costTime: 2561ms  avergeTime: 102.44ms
byteSize: 32768
>> generateBytes
count: 25  costTime: 2500ms  avergeTime: 100.00ms
>> generateBuffer
count: 25  costTime: 5704ms  avergeTime: 228.16ms

Done.
```
 
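Here block writes are slightly faster than buffered writes, by about 1% to 3% in most rows. (With more data and more repetitions the gap would be somewhat clearer.)
Test conclusions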
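- Under the same conditions, single-byte IO is far slower than block or buffered IO: roughly 50 times slower in the runs above (38x to 56x depending on file size). The reason is that every single read or write call goes all the way to the file, so the per-call overhead adds up heavily across a large number of operations.
- For block IO, the block size affects speed: larger blocks are faster, but with diminishing returns. The same holds for buffered IO. Pick a moderate block size, since one that is too large wastes memory and one that is too small makes IO too slow; 1024 or 2048 is a reasonable choice. For buffered IO, the default buffer size is fine unless you have special requirements, so there is no need to specify one yourself.
- Block IO is somewhat faster than buffered IO. The likely main reason is that the buffered classes are decorators wrapped around the underlying stream, which adds extra work along the way, such as additional method calls for every byte. That makes them slightly slower than block IO.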
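
Applying these conclusions, a block copy loop with a 2048-byte block might look like the sketch below. The class name `BlockCopy` and the constant `BLOCK_SIZE` are assumptions for illustration; the value 2048 simply follows the recommendation above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BlockCopy {
    // Block size taken from the recommendation above (1024 or 2048).
    static final int BLOCK_SIZE = 2048;

    // Copies everything from in to out in blocks and returns the byte count.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] block = new byte[BLOCK_SIZE];
        long total = 0;
        int n;
        // read(byte[]) may return fewer bytes than the block holds, so only
        // the n bytes actually read are written.
        while ((n = in.read(block)) != -1) {
            out.write(block, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[5000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println("copied " + copied + " bytes");
    }
}
```

The same `copy` method works unchanged with a FileInputStream and FileOutputStream, since it depends only on the InputStream and OutputStream interfaces.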
 
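Reading and writing with the RandomAccessFile class
This approach is more flexible and closely resembles file IO in C/C++: the seek() method moves the current position within the file, which makes reading data more convenient and flexible. The class also provides additional write and read methods that make it easier to read and write values of different types.
Here too there are two ways to read and write: single-byte calls, or block IO through a buffer you create yourself.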
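
A minimal sketch of these features, assuming a made-up fixed-size record layout (an int id followed by a double value). The class name `RandomAccessDemo` and its methods are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class RandomAccessDemo {
    // Assumed record layout: int id (4 bytes) followed by double value (8 bytes).
    static final int RECORD_SIZE = 12;

    // Writes count records with the typed write methods.
    public static void writeRecords(File file, int count) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0); // start from an empty file
            for (int i = 0; i < count; i++) {
                raf.writeInt(i);
                raf.writeDouble(i * 1.5);
            }
        }
    }

    // Jumps straight to one record with seek() and reads its double value.
    public static double readValue(File file, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek((long) index * RECORD_SIZE + 4); // skip the int id
            return raf.readDouble();
        }
    }

    // Block read: fetches one whole record into a caller-owned buffer.
    public static byte[] readRecordBytes(File file, int index) throws IOException {
        byte[] block = new byte[RECORD_SIZE];
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek((long) index * RECORD_SIZE);
            raf.readFully(block);
        }
        return block;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("records", ".bin");
        file.deleteOnExit();
        writeRecords(file, 10);
        System.out.println("value of record 3: " + readValue(file, 3));
        System.out.println("record bytes: " + readRecordBytes(file, 3).length);
    }
}
```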
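Other issues
Many of the IO classes in the JDK's java.io package are thread-safe because they synchronize internally. The thread safety is well implemented, but if a program runs on a single thread, these guarantees are unnecessary. Even though the JVM optimizes them, they can still slow reads and writes down. Besides, you can always implement thread safety yourself where you need it.
When you are sure your code runs in a single-threaded environment, or you have added your own synchronization, consider the IO classes in the com.liferay.portal.kernel.io.unsync package (part of Liferay Portal, not the JDK). They can noticeably improve your application's IO performance.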
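
To make the idea concrete, here is a minimal sketch of a buffered output stream with no synchronization at all. It is not Liferay's implementation; the class name `SimpleUnsyncBufferedOutputStream` is hypothetical, and the class is safe only when a single thread uses it or the caller synchronizes externally.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical illustration, not a library class: no locking anywhere.
public class SimpleUnsyncBufferedOutputStream extends OutputStream {
    private final OutputStream out;
    private final byte[] buf;
    private int count; // number of bytes currently held in buf

    public SimpleUnsyncBufferedOutputStream(OutputStream out, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.out = out;
        this.buf = new byte[size];
    }

    @Override
    public void write(int b) throws IOException {
        if (count == buf.length) {
            flushBuffer();
        }
        buf[count++] = (byte) b;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (len >= buf.length) {
            // Large writes bypass the buffer after pending bytes are flushed.
            flushBuffer();
            out.write(b, off, len);
            return;
        }
        if (len > buf.length - count) {
            flushBuffer();
        }
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

    @Override
    public void flush() throws IOException {
        flushBuffer();
        out.flush();
    }

    @Override
    public void close() throws IOException {
        try {
            flush();
        } finally {
            out.close();
        }
    }

    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream o = new SimpleUnsyncBufferedOutputStream(sink, 8)) {
            for (int i = 0; i < 20; i++) {
                o.write(i);
            }
        }
        System.out.println("bytes written: " + sink.size());
    }
}
```

Whether dropping synchronization yields a measurable gain depends on the JVM and workload, so it is worth timing against BufferedOutputStream before adopting such a class.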