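Java IO Performance Analysis
Overview
IO operations are extremely common in programs and can easily become a performance bottleneck, so it is worth analyzing IO performance and improving the program accordingly.
This article mainly covers traditional IO (TIO), with a small amount on the newer IO APIs (NIO and NIO2). The analysis focuses on reading and writing disk files; other kinds of IO, such as network IO and window IO, are not analyzed in depth.
The IO discussed here is essentially all low-level IO.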
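TIO Analysis
Common IO operation types
- Reading and writing a single byte/character at a time by calling read() and write(int) directly
- Creating a buffer and reading and writing in blocks by calling methods such as read(char/byte[] buf) and write(char/byte[] buf)
- Using the buffered read/write classes in the io package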
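The tests in this article only exercise writes, so as a rough illustration of the same three styles on the read side, a minimal sketch might look like the following. The class name ReadStyles, its method names, and the 1024-byte block size used in main are choices made for this example and are not part of the original test code.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative only: ReadStyles and its method names are made up for this sketch.
final class ReadStyles {

    // Style 1: one read() call per byte on the raw file stream.
    static long readSingleByte(String filename) throws IOException {
        long total = 0;
        try (InputStream in = new FileInputStream(filename)) {
            while (in.read() != -1) {
                total++;
            }
        }
        return total;
    }

    // Style 2: block reads into a byte array managed by the caller.
    static long readBlock(String filename, int blockSize) throws IOException {
        long total = 0;
        byte[] buf = new byte[blockSize];
        try (InputStream in = new FileInputStream(filename)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n; // n may be smaller than blockSize
            }
        }
        return total;
    }

    // Style 3: single-byte calls served from BufferedInputStream's internal buffer.
    static long readBuffered(String filename) throws IOException {
        long total = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(filename))) {
            while (in.read() != -1) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Create a 64 KiB scratch file so the sketch runs without arguments.
        File tmp = File.createTempFile("read-styles", ".bin");
        tmp.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(tmp)) {
            out.write(new byte[64 * 1024]);
        }
        String name = tmp.getPath();
        System.out.println(readSingleByte(name) + " " + readBlock(name, 1024) + " " + readBuffered(name));
    }
}
```

All three methods return the same byte count; they differ only in how many calls reach the underlying file stream.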
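Timing tests
Java's character stream classes read bytes at the lowest level and then convert them to characters before returning them, so they take more time. For that reason the differences between the operations on character streams are not compared here; they are similar to those for byte streams, only more pronounced.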
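To make the extra decoding step concrete, here is a small sketch that pushes UTF-8 bytes through an InputStreamReader and counts the characters that come out. The class name CharDecodeDemo and the sample string are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Illustrative only: CharDecodeDemo is a made-up name for this sketch.
final class CharDecodeDemo {

    // Counts the chars an InputStreamReader produces when decoding the given UTF-8 bytes.
    static int countDecodedChars(byte[] utf8Bytes) throws IOException {
        int chars = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                chars++;
            }
        }
        return chars;
    }

    public static void main(String[] args) throws IOException {
        // Two CJK characters, written as escapes so the source file encoding does not matter.
        byte[] bytes = "\u4f60\u597d".getBytes(StandardCharsets.UTF_8);
        // prints "6 bytes -> 2 chars"
        System.out.println(bytes.length + " bytes -> " + countDecodedChars(bytes) + " chars");
    }
}
```

The reader has to consume six bytes to hand back two characters, which is the conversion work a plain byte stream does not do.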
First, some utility methods:
    import java.io.*;

    public class WriteDataTest {

        // Writes `size` random bytes, one write(int) call per byte.
        public static void generateByte(String filename, long size) throws Exception {
            try (FileOutputStream fos = new FileOutputStream(filename)) {
                for (long i = 0; i < size; i++) {
                    fos.write((int) (Math.random() * 128));
                }
            }
        }

        // Writes `size` random bytes in blocks of `byteSize` bytes.
        public static void generateBytes(String filename, long size, int byteSize) throws Exception {
            if (byteSize < 1) byteSize = 512;
            if (byteSize > size) byteSize = (int) size;
            byte[] buf = new byte[byteSize];
            try (FileOutputStream fos = new FileOutputStream(filename)) {
                for (long i = 0; i < size; i += byteSize) {
                    // The last block is shorter when size is not a multiple of byteSize.
                    int len = (int) Math.min(byteSize, size - i);
                    for (int j = 0; j < len; j++) {
                        buf[j] = (byte) (Math.random() * 128);
                    }
                    fos.write(buf, 0, len);
                }
            }
        }

        // Writes `size` random bytes one at a time through a BufferedOutputStream.
        // A bufferSize below 1 selects the default buffer size.
        public static void generateBuffer(String filename, long size, int bufferSize) throws Exception {
            FileOutputStream fos = new FileOutputStream(filename);
            try (BufferedOutputStream bos = bufferSize < 1
                    ? new BufferedOutputStream(fos)
                    : new BufferedOutputStream(fos, bufferSize)) {
                for (long i = 0; i < size; i++) {
                    bos.write((int) (Math.random() * 128));
                }
            }
        }
    }

    // Simple stopwatch based on System.currentTimeMillis().
    class MyTimer {
        private long startTime = -1;
        private long endTime = -1;
        private long costTime = -1;

        // Starts timing; returns false if the timer is already running.
        public boolean start() {
            if (startTime != -1) return false;
            startTime = System.currentTimeMillis();
            return true;
        }

        // Stops timing and records the elapsed time; returns false if not running.
        public boolean stop() {
            if (startTime == -1) return false;
            endTime = System.currentTimeMillis();
            costTime = endTime - startTime;
            startTime = -1;
            return true;
        }

        // Elapsed milliseconds of the last start()/stop() pair.
        public long getTime() {
            return costTime;
        }

        @Override
        public String toString() {
            return "" + costTime;
        }
    }
Comparing the write time of the three IO approaches
Each of the three approaches generates data of every size 10 times:
    // This main method goes inside WriteDataTest.
    public static void main(String... args) throws Exception {
        System.out.println("Generating data...");
        long[] size = new long[]{128*1024, 256*1024, 512*1024, 1024*1024,
                                 2*1024*1024, 4*1024*1024, 8*1024*1024};
        String[] sizeName = new String[]{"128K", "256K", "512K", "1M", "2M", "4M", "8M"};
        String filename = "TestData";
        int count = 10;
        MyTimer timer = new MyTimer();
        for (int i = 0; i < size.length; i++) {
            // Block writes with a 1024-byte block.
            timer.start();
            System.out.println(">> generateBytes");
            for (int j = 0; j < count; j++) {
                generateBytes(filename, size[i], 1024);
            }
            timer.stop();
            System.out.printf("size: %s costTime: %dms avergeTime: %.2fms\n",
                    sizeName[i], timer.getTime(), (double)(timer.getTime() / (count*1.0)));

            // Buffered writes with a 1024-byte buffer.
            timer.start();
            System.out.println(">> generateBuffer");
            for (int j = 0; j < count; j++) {
                generateBuffer(filename, size[i], 1024);
            }
            timer.stop();
            System.out.printf("size: %s costTime: %dms avergeTime: %.2fms\n",
                    sizeName[i], timer.getTime(), (double)(timer.getTime() / (count*1.0)));

            // Single-byte writes.
            timer.start();
            System.out.println(">> generateByte");
            for (int j = 0; j < count; j++) {
                generateByte(filename, size[i]);
            }
            timer.stop();
            System.out.printf("size: %s costTime: %dms avergeTime: %.2fms\n",
                    sizeName[i], timer.getTime(), (double)(timer.getTime() / (count*1.0)));
        }
        System.out.println("\nDone.");
    }
Test results
    Generating data...
    >> generateBytes
    size: 128K costTime: 53ms avergeTime: 5.30ms
    >> generateBuffer
    size: 128K costTime: 53ms avergeTime: 5.30ms
    >> generateByte
    size: 128K costTime: 2024ms avergeTime: 202.40ms
    >> generateBytes
    size: 256K costTime: 76ms avergeTime: 7.60ms
    >> generateBuffer
    size: 256K costTime: 78ms avergeTime: 7.80ms
    >> generateByte
    size: 256K costTime: 4026ms avergeTime: 402.60ms
    >> generateBytes
    size: 512K costTime: 159ms avergeTime: 15.90ms
    >> generateBuffer
    size: 512K costTime: 156ms avergeTime: 15.60ms
    >> generateByte
    size: 512K costTime: 8022ms avergeTime: 802.20ms
    >> generateBytes
    size: 1M costTime: 294ms avergeTime: 29.40ms
    >> generateBuffer
    size: 1M costTime: 298ms avergeTime: 29.80ms
    >> generateByte
    size: 1M costTime: 16606ms avergeTime: 1660.60ms
    >> generateBytes
    size: 2M costTime: 619ms avergeTime: 61.90ms
    >> generateBuffer
    size: 2M costTime: 667ms avergeTime: 66.70ms
    >> generateByte
    size: 2M costTime: 34323ms avergeTime: 3432.30ms
    >> generateBytes
    size: 4M costTime: 1269ms avergeTime: 126.90ms
    >> generateBuffer
    size: 4M costTime: 1272ms avergeTime: 127.20ms
    >> generateByte
    size: 4M costTime: 68344ms avergeTime: 6834.40ms
    >> generateBytes
    size: 8M costTime: 2491ms avergeTime: 249.10ms
    >> generateBuffer
    size: 8M costTime: 2622ms avergeTime: 262.20ms
    >> generateByte
    size: 8M costTime: 139931ms avergeTime: 13993.10ms

    Done.
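Clearly, single-byte writes are far slower than the other two approaches, while block writes are slightly faster than buffered writes in most of these runs. More detailed tests follow.
Effect of block size on block write speed
Testing the block size of the buffer used by the generateBytes method: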
    // This main method goes inside WriteDataTest.
    public static void main(String... args) throws Exception {
        System.out.println("Generating data...");
        long size = 1024 * 1024;
        String filename = "1M";
        int count = 10;
        MyTimer timer = new MyTimer();

        // Loop bounds chosen to match the results listed below (128 to 32768).
        for (int i = 128; i <= 32768; i *= 2) {
            System.out.println("byteSize: " + i);
            timer.start();
            System.out.println(">> generateBytes");
            for (int j = 0; j < count; j++) {
                generateBytes(filename, size, i);
            }
            timer.stop();
            System.out.printf("count: %d costTime: %dms avergeTime: %.2fms\n",
                    count, timer.getTime(), timer.getTime() / (count*1.0));
        }
        System.out.println("\nDone.");
    }
Test results
    Generating data...
    byteSize: 128
    >> generateBytes
    count: 10 costTime: 456ms avergeTime: 45.60ms
    byteSize: 256
    >> generateBytes
    count: 10 costTime: 337ms avergeTime: 33.70ms
    byteSize: 512
    >> generateBytes
    count: 10 costTime: 307ms avergeTime: 30.70ms
    byteSize: 1024
    >> generateBytes
    count: 10 costTime: 290ms avergeTime: 29.00ms
    byteSize: 2048
    >> generateBytes
    count: 10 costTime: 304ms avergeTime: 30.40ms
    byteSize: 4096
    >> generateBytes
    count: 10 costTime: 278ms avergeTime: 27.80ms
    byteSize: 8192
    >> generateBytes
    count: 10 costTime: 287ms avergeTime: 28.70ms
    byteSize: 16384
    >> generateBytes
    count: 10 costTime: 279ms avergeTime: 27.90ms
    byteSize: 32768
    >> generateBytes
    count: 10 costTime: 286ms avergeTime: 28.60ms

    Done.
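The results above show that block size does affect write speed, but a larger block is not always faster: beyond a certain point the gains become very small.
Effect of buffer size on buffered write speed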
    // This main method goes inside WriteDataTest.
    public static void main(String... args) throws Exception {
        System.out.println("Generating data...");
        long size = 1024 * 1024;
        String filename = "1M";
        int count = 10;
        MyTimer timer = new MyTimer();

        // Buffer sizes from 2 to 32768 bytes, doubling each round.
        for (int i = 2; i < 4096 * 16; i *= 2) {
            System.out.println("byteSize: " + i);
            timer.start();
            System.out.println(">> generateBuffer");
            for (int j = 0; j < count; j++) {
                generateBuffer(filename, size, i);
            }
            timer.stop();
            System.out.printf("count: %d costTime: %dms avergeTime: %.2fms\n",
                    count, timer.getTime(), timer.getTime() / (count*1.0));
        }
        System.out.println("\nDone.");
    }
Test results
    Generating data...
    byteSize: 2
    >> generateBuffer
    count: 10 costTime: 9106ms avergeTime: 910.60ms
    byteSize: 4
    >> generateBuffer
    count: 10 costTime: 4740ms avergeTime: 474.00ms
    byteSize: 8
    >> generateBuffer
    count: 10 costTime: 2581ms avergeTime: 258.10ms
    byteSize: 16
    >> generateBuffer
    count: 10 costTime: 1408ms avergeTime: 140.80ms
    byteSize: 32
    >> generateBuffer
    count: 10 costTime: 877ms avergeTime: 87.70ms
    byteSize: 64
    >> generateBuffer
    count: 10 costTime: 599ms avergeTime: 59.90ms
    byteSize: 128
    >> generateBuffer
    count: 10 costTime: 459ms avergeTime: 45.90ms
    byteSize: 256
    >> generateBuffer
    count: 10 costTime: 397ms avergeTime: 39.70ms
    byteSize: 512
    >> generateBuffer
    count: 10 costTime: 361ms avergeTime: 36.10ms
    byteSize: 1024
    >> generateBuffer
    count: 10 costTime: 341ms avergeTime: 34.10ms
    byteSize: 2048
    >> generateBuffer
    count: 10 costTime: 307ms avergeTime: 30.70ms
    byteSize: 4096
    >> generateBuffer
    count: 10 costTime: 297ms avergeTime: 29.70ms
    byteSize: 8192
    >> generateBuffer
    count: 10 costTime: 294ms avergeTime: 29.40ms
    byteSize: 16384
    >> generateBuffer
    count: 10 costTime: 306ms avergeTime: 30.60ms
    byteSize: 32768
    >> generateBuffer
    count: 10 costTime: 306ms avergeTime: 30.60ms

    Done.
As with block writes, the buffer size affects the speed of buffered writes, and the effect shrinks as the buffer grows.
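One way to see why the gains flatten out is to count how many write calls actually reach the underlying stream for each buffer size. The sketch below does that with an in-memory sink instead of a file, so it measures call counts rather than time. WriteCallCounter and CountingSink are names invented for this example.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Illustrative only: WriteCallCounter and CountingSink are made-up names.
final class WriteCallCounter {

    // Sink that discards the data and only counts the write calls that reach it.
    static final class CountingSink extends OutputStream {
        long calls = 0;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    // Writes `size` bytes one at a time through a BufferedOutputStream and
    // returns how many write calls were forwarded to the underlying sink.
    static long underlyingWrites(int size, int bufferSize) throws IOException {
        CountingSink sink = new CountingSink();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < size; i++) {
                out.write(i & 0x7f);
            }
        }
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        int size = 1024 * 1024;
        for (int bufferSize = 2; bufferSize <= 32768; bufferSize *= 2) {
            System.out.printf("bufferSize: %d underlying writes: %d%n",
                    bufferSize, underlyingWrites(size, bufferSize));
        }
    }
}
```

Each doubling of the buffer halves the number of forwarded calls. Going from 2 to 4 bytes removes hundreds of thousands of calls for 1M of data, while going from 16384 to 32768 removes only a few dozen, which is consistent with the flattening seen in the timings.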
Speed difference when the block size equals the buffer size
Testing the speed of both approaches at different buffer sizes:
    // This main method goes inside WriteDataTest.
    public static void main(String... args) throws Exception {
        // Setup repeated from the previous listings.
        System.out.println("Generating data...");
        long size = 1024 * 1024;
        String filename = "1M";
        int count = 10;
        MyTimer timer = new MyTimer();

        for (int i = 2; i < 4096 * 16; i *= 2) {
            System.out.println("byteSize: " + i);

            // Block writes with block size i.
            timer.start();
            System.out.println(">> generateBytes");
            for (int j = 0; j < count; j++) {
                generateBytes(filename, size, i);
            }
            timer.stop();
            System.out.printf("count: %d costTime: %dms avergeTime: %.2fms\n",
                    count, timer.getTime(), timer.getTime() / (count*1.0));

            // Buffered writes with buffer size i.
            timer.start();
            System.out.println(">> generateBuffer");
            for (int j = 0; j < count; j++) {
                generateBuffer(filename, size, i);
            }
            timer.stop();
            System.out.printf("count: %d costTime: %dms avergeTime: %.2fms\n",
                    count, timer.getTime(), timer.getTime() / (count*1.0));
        }
        System.out.println("\nDone.");
    }
Test results
    Generating data...
    byteSize: 2
    >> generateBytes
    count: 10 costTime: 8532ms avergeTime: 853.20ms
    >> generateBuffer
    count: 10 costTime: 8558ms avergeTime: 855.80ms
    byteSize: 4
    >> generateBytes
    count: 10 costTime: 4405ms avergeTime: 440.50ms
    >> generateBuffer
    count: 10 costTime: 4441ms avergeTime: 444.10ms
    byteSize: 8
    >> generateBytes
    count: 10 costTime: 2395ms avergeTime: 239.50ms
    >> generateBuffer
    count: 10 costTime: 2368ms avergeTime: 236.80ms
    byteSize: 16
    >> generateBytes
    count: 10 costTime: 1326ms avergeTime: 132.60ms
    >> generateBuffer
    count: 10 costTime: 1340ms avergeTime: 134.00ms
    byteSize: 32
    >> generateBytes
    count: 10 costTime: 821ms avergeTime: 82.10ms
    >> generateBuffer
    count: 10 costTime: 814ms avergeTime: 81.40ms
    byteSize: 64
    >> generateBytes
    count: 10 costTime: 559ms avergeTime: 55.90ms
    >> generateBuffer
    count: 10 costTime: 559ms avergeTime: 55.90ms
    byteSize: 128
    >> generateBytes
    count: 10 costTime: 419ms avergeTime: 41.90ms
    >> generateBuffer
    count: 10 costTime: 458ms avergeTime: 45.80ms
    byteSize: 256
    >> generateBytes
    count: 10 costTime: 350ms avergeTime: 35.00ms
    >> generateBuffer
    count: 10 costTime: 367ms avergeTime: 36.70ms
    byteSize: 512
    >> generateBytes
    count: 10 costTime: 338ms avergeTime: 33.80ms
    >> generateBuffer
    count: 10 costTime: 333ms avergeTime: 33.30ms
    byteSize: 1024
    >> generateBytes
    count: 10 costTime: 312ms avergeTime: 31.20ms
    >> generateBuffer
    count: 10 costTime: 309ms avergeTime: 30.90ms
    byteSize: 2048
    >> generateBytes
    count: 10 costTime: 300ms avergeTime: 30.00ms
    >> generateBuffer
    count: 10 costTime: 303ms avergeTime: 30.30ms
    byteSize: 4096
    >> generateBytes
    count: 10 costTime: 292ms avergeTime: 29.20ms
    >> generateBuffer
    count: 10 costTime: 317ms avergeTime: 31.70ms
    byteSize: 8192
    >> generateBytes
    count: 10 costTime: 284ms avergeTime: 28.40ms
    >> generateBuffer
    count: 10 costTime: 297ms avergeTime: 29.70ms
    byteSize: 16384
    >> generateBytes
    count: 10 costTime: 297ms avergeTime: 29.70ms
    >> generateBuffer
    count: 10 costTime: 310ms avergeTime: 31.00ms
    byteSize: 32768
    >> generateBytes
    count: 10 costTime: 323ms avergeTime: 32.30ms
    >> generateBuffer
    count: 10 costTime: 291ms avergeTime: 29.10ms

    Done.
Results when generating 4M of data with 25 repetitions per measurement, starting from a block size of 128:
    Generating data...
    byteSize: 128
    >> generateBytes
    count: 25 costTime: 3896ms avergeTime: 155.84ms
    >> generateBuffer
    count: 25 costTime: 3932ms avergeTime: 157.28ms
    byteSize: 256
    >> generateBytes
    count: 25 costTime: 3151ms avergeTime: 126.04ms
    >> generateBuffer
    count: 25 costTime: 3194ms avergeTime: 127.76ms
    byteSize: 512
    >> generateBytes
    count: 25 costTime: 2831ms avergeTime: 113.24ms
    >> generateBuffer
    count: 25 costTime: 2854ms avergeTime: 114.16ms
    byteSize: 1024
    >> generateBytes
    count: 25 costTime: 2665ms avergeTime: 106.60ms
    >> generateBuffer
    count: 25 costTime: 2690ms avergeTime: 107.60ms
    byteSize: 2048
    >> generateBytes
    count: 25 costTime: 2578ms avergeTime: 103.12ms
    >> generateBuffer
    count: 25 costTime: 2643ms avergeTime: 105.72ms
    byteSize: 4096
    >> generateBytes
    count: 25 costTime: 2516ms avergeTime: 100.64ms
    >> generateBuffer
    count: 25 costTime: 2544ms avergeTime: 101.76ms
    byteSize: 8192
    >> generateBytes
    count: 25 costTime: 2502ms avergeTime: 100.08ms
    >> generateBuffer
    count: 25 costTime: 2539ms avergeTime: 101.56ms
    byteSize: 16384
    >> generateBytes
    count: 25 costTime: 2607ms avergeTime: 104.28ms
    >> generateBuffer
    count: 25 costTime: 2561ms avergeTime: 102.44ms
    byteSize: 32768
    >> generateBytes
    count: 25 costTime: 2500ms avergeTime: 100.00ms
    >> generateBuffer
    count: 25 costTime: 5704ms avergeTime: 228.16ms

    Done.
Here block writes are very slightly faster than buffered writes (with more data and more repetitions the gap would probably be clearer).
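Test conclusions
- Under the same conditions, single-byte reads and writes are far slower than block or buffered reads and writes, by a factor of roughly 40 to 55 in the tests above. The reason is that every single call goes all the way to the file, so the overhead of a large number of calls is too high.
- For block reads and writes, the block size affects speed: larger blocks are generally faster, but the gain keeps shrinking. The same holds for buffered reads and writes. A moderate block size is enough: too large a block can cause memory problems, and too small a block makes IO slow. 1024 or 2048 is a reasonable choice. For buffered reads and writes, unless there is a special requirement, the default buffer size is fine and there is no need to specify one manually.
- Compared with buffered reads and writes, block reads and writes are slightly faster. The main reason is that the buffered classes are a decorator wrapped around the underlying stream, which adds extra work, such as additional method calls, that the block approach avoids.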
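Putting the block-size advice into practice, a stream copy with a fixed 2048-byte block might be sketched as follows. BlockCopy, its copy method, and the BLOCK_SIZE constant are hypothetical names for this example; 2048 is simply one of the two sizes suggested above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative only: BlockCopy is a made-up helper, not part of the original tests.
final class BlockCopy {

    // One of the block sizes suggested in the conclusions.
    static final int BLOCK_SIZE = 2048;

    // Copies everything from `in` to `out` in blocks and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[BLOCK_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n); // write only the bytes actually read
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams keep the sketch self-contained; file streams work the same way.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(new byte[5000]), out);
        System.out.println("copied " + copied + " bytes");
    }
}
```

Passing the read count to write(buf, 0, n) matters for the final block, which is usually shorter than the buffer.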
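Reading and writing with the RandomAccessFile class
This approach is more flexible and closely resembles file IO in C/C++: the seek() method moves the current position within the file, which makes it easier to read data from arbitrary locations. The class also provides additional write and read methods that make it convenient to read and write different data types.
There are again two ways to read and write: one byte at a time, or in blocks through a buffer you create yourself.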
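As a minimal sketch of the features mentioned above, the following example writes a sequence of ints with writeInt(), jumps to one of them with seek(), and also performs a block read from a chosen offset. RandomAccessDemo and its method names are assumptions made for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Illustrative only: RandomAccessDemo and its method names are made up for this sketch.
final class RandomAccessDemo {

    // Writes the ints 0, 10, 20, ... (count values), then seeks to the value
    // at position `index` and reads it back.
    static int writeThenReadAt(File file, int count, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0); // start from an empty file
            for (int i = 0; i < count; i++) {
                raf.writeInt(i * 10);
            }
            raf.seek((long) index * Integer.BYTES); // each int occupies 4 bytes
            return raf.readInt();
        }
    }

    // Block read: seeks to `offset` and fills `buf`, returning the number of bytes read.
    static int readBlockAt(File file, long offset, byte[] buf) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offset);
            return raf.read(buf);
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("raf-demo", ".bin");
        tmp.deleteOnExit();
        System.out.println("value at index 42: " + writeThenReadAt(tmp, 100, 42));
        byte[] block = new byte[16];
        System.out.println("block read returned " + readBlockAt(tmp, 0, block) + " bytes");
    }
}
```

The typed methods such as writeInt() and readInt() are single small operations on the file, so for bulk data the block variant is the one that benefits from the buffer-size observations made earlier.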
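Other issues
The IO classes in the JDK's java.io package are largely written to be thread-safe (the buffered streams, for example, synchronize their operations). That thread safety is well implemented, but when a program runs on a single thread these guarantees are unnecessary; even though the JVM optimizes some of this away, it can still slow reads and writes down. Besides, you can always implement thread safety yourself in whatever way suits your program.
If you are sure your code runs in a single-threaded environment, or you provide your own synchronization, consider the IO classes in the com.liferay.portal.kernel.io.unsync package. They can substantially improve your application's IO performance.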
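To illustrate the idea without depending on Liferay, here is a minimal sketch of what an unsynchronized buffered output stream can look like. SimpleUnsyncBufferedOutputStream is a hypothetical class written for this example; it is not the Liferay implementation and makes no claim to match its API or behavior.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for illustration; NOT the Liferay class.
// No method is synchronized, so it must only be used from one thread at a time.
final class SimpleUnsyncBufferedOutputStream extends OutputStream {
    private final OutputStream out;
    private final byte[] buf;
    private int count; // number of valid bytes currently in buf

    SimpleUnsyncBufferedOutputStream(OutputStream out, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.out = out;
        this.buf = new byte[size];
    }

    @Override
    public void write(int b) throws IOException {
        if (count == buf.length) {
            flushBuffer();
        }
        buf[count++] = (byte) b;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len >= buf.length) {
            // Large writes bypass the buffer after pending bytes are flushed.
            flushBuffer();
            out.write(b, off, len);
            return;
        }
        if (len > buf.length - count) {
            flushBuffer();
        }
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

    @Override
    public void flush() throws IOException {
        flushBuffer();
        out.flush();
    }

    @Override
    public void close() throws IOException {
        try {
            flush();
        } finally {
            out.close();
        }
    }

    // Pushes buffered bytes to the underlying stream.
    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream os = new SimpleUnsyncBufferedOutputStream(sink, 4)) {
            for (int i = 0; i < 10; i++) {
                os.write(i);
            }
        }
        System.out.println("bytes written: " + sink.size());
    }
}
```

Whether dropping synchronization gives a measurable speedup depends on the JVM and the workload, so it is worth timing it with the same kind of test used earlier before relying on it.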