dcLunatic's blog

Java IO Performance Analysis

2018/09/21

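Overview

IO operations are extremely common in programs, and they very easily become a performance bottleneck, so it is worth analyzing IO performance and improving the program accordingly.

This article mainly covers traditional IO (TIO) and only briefly touches on the newer IO (NIO and NIO2). The analysis focuses on reading and writing disk files; network IO, window IO, and the like are not analyzed in depth.

Nearly all of the IO discussed here is low-level IO.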

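TIO Analysis

Common IO operation types

  1. Read and write one byte/char at a time by calling read() and write(int) directly
  2. Allocate a buffer and read and write in blocks by calling read(char/byte[] buf), write(char/byte[] buf), and similar methods
  3. Use the buffered read/write classes in the io package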
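
To make the three styles concrete, here is a minimal sketch of each one on the read side. The class name ReadStylesDemo, its method names, and the temporary file are assumptions made for illustration only; they are not part of the test code used later in this article.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: counts the bytes of a file using each of the three styles.
class ReadStylesDemo {
    // Style 1: one read() call per byte, directly on the file stream.
    static long readSingleBytes(Path file) throws IOException {
        long total = 0;
        try (InputStream in = new FileInputStream(file.toFile())) {
            while (in.read() != -1) {
                total++;
            }
        }
        return total;
    }

    // Style 2: block reads into a byte[] buffer managed by the caller.
    static long readBlocks(Path file, int blockSize) throws IOException {
        long total = 0;
        byte[] buf = new byte[blockSize];
        try (InputStream in = new FileInputStream(file.toFile())) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    // Style 3: single-byte reads served from BufferedInputStream's internal buffer.
    static long readBuffered(Path file) throws IOException {
        long total = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(file.toFile()))) {
            while (in.read() != -1) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("read-styles", ".bin");
        try {
            Files.write(file, new byte[10_000]);
            System.out.println(readSingleBytes(file));
            System.out.println(readBlocks(file, 1024));
            System.out.println(readBuffered(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

All three methods return the same byte count; they differ only in how many calls reach the underlying file stream.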

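Timing tests

Java's character stream classes read bytes at the lower level and then convert them to characters before returning them, so they take more time. For that reason the operations on character streams are not compared separately here: the differences are similar to those for byte streams, only more pronounced.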
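
The extra decoding step can be seen in a small sketch like the one below, which decodes UTF-8 bytes through an InputStreamReader. The class name CharDecodeDemo and the sample text are assumptions chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a Reader pulls bytes from the underlying stream
// and decodes them into chars before handing them to the caller.
class CharDecodeDemo {
    // Counts the chars produced when the given bytes are decoded as UTF-8.
    static int countChars(byte[] utf8) throws IOException {
        int chars = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                chars++;
            }
        }
        return chars;
    }

    public static void main(String[] args) throws IOException {
        // "IO" followed by two CJK characters, written as Unicode escapes
        // so the result does not depend on the source file encoding.
        byte[] data = "IO\u6027\u80fd".getBytes(StandardCharsets.UTF_8);
        System.out.println(data.length + " bytes -> " + countChars(data) + " chars");
    }
}
```

Each CJK character here occupies three bytes in UTF-8, so the reader consumes 8 bytes to deliver 4 chars.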

First, a few utility methods:

import java.io.*;

public class WriteDataTest {
    // Mode 1: write `size` random bytes one at a time, straight to the FileOutputStream.
    public static void generateByte(String filename, long size) throws IOException {
        // FileOutputStream creates the file if it does not exist yet.
        try (FileOutputStream fos = new FileOutputStream(filename)) {
            for (long i = 0; i < size; i++) {
                fos.write((int) (Math.random() * 128));
            }
        }
    }

    // Mode 2: fill a byte[] block of `byteSize` bytes, then write the whole block at once.
    public static void generateBytes(String filename, long size, int byteSize) throws IOException {
        if (byteSize < 1)
            byteSize = 512;
        if (byteSize > size)
            byteSize = (int) size;
        byte[] buf = new byte[byteSize];
        try (FileOutputStream fos = new FileOutputStream(filename)) {
            for (long i = 0; i < size; i += byteSize) {
                // The last block may be shorter when size is not a multiple of byteSize.
                int len = (int) Math.min(byteSize, size - i);
                for (int j = 0; j < len; j++)
                    buf[j] = (byte) (Math.random() * 128);
                fos.write(buf, 0, len);
            }
        }
    }

    // Mode 3: write one byte at a time through a BufferedOutputStream.
    public static void generateBuffer(String filename, long size, int bufferSize) throws IOException {
        FileOutputStream fos = new FileOutputStream(filename);
        // bufferSize < 1 selects BufferedOutputStream's default buffer size.
        try (BufferedOutputStream bos = bufferSize < 1
                ? new BufferedOutputStream(fos)
                : new BufferedOutputStream(fos, bufferSize)) {
            for (long i = 0; i < size; i++) {
                bos.write((int) (Math.random() * 128));
            }
        }
    }
}

// Simple stopwatch with millisecond resolution.
class MyTimer {
    private long startTime = -1;
    private long endTime = -1;
    private long costTime = -1;

    // Returns false if the timer is already running.
    public boolean start() {
        if (startTime != -1)
            return false;
        startTime = System.currentTimeMillis();
        return true;
    }

    // Returns false if the timer was never started.
    public boolean stop() {
        if (startTime == -1)
            return false;
        endTime = System.currentTimeMillis();
        costTime = endTime - startTime;
        startTime = -1;
        return true;
    }

    // Elapsed milliseconds of the last start()/stop() pair.
    public long getTime() {
        return costTime;
    }

    @Override
    public String toString() {
        return "" + costTime;
    }
}

Comparing write times of the three IO modes

Each of the three modes generates data of several different sizes, 10 times per size.

// main method of WriteDataTest
public static void main(String... args) throws Exception {
    System.out.println("Generating data...");
    long[] size = new long[]{128*1024, 256*1024, 512*1024, 1024*1024, 2*1024*1024, 4*1024*1024, 8*1024*1024};
    String[] sizeName = new String[]{"128K", "256K", "512K", "1M", "2M", "4M", "8M"};
    String filename = "TestData";
    int count = 10;
    MyTimer timer = new MyTimer();
    for (int i = 0; i < size.length; i++) {
        // Block writes with a 1024-byte block; the timer covers all `count` runs.
        timer.start();
        System.out.println(">> generateBytes");
        for (int j = 0; j < count; j++) {
            generateBytes(filename, size[i], 1024);
        }
        timer.stop();
        System.out.printf("size: %s costTime: %dms avergeTime: %.2fms\n", sizeName[i], timer.getTime(), timer.getTime() / (count * 1.0));

        // Buffered writes with a 1024-byte buffer.
        timer.start();
        System.out.println(">> generateBuffer");
        for (int j = 0; j < count; j++) {
            generateBuffer(filename, size[i], 1024);
        }
        timer.stop();
        System.out.printf("size: %s costTime: %dms avergeTime: %.2fms\n", sizeName[i], timer.getTime(), timer.getTime() / (count * 1.0));

        // Single-byte writes.
        timer.start();
        System.out.println(">> generateByte");
        for (int j = 0; j < count; j++) {
            generateByte(filename, size[i]);
        }
        timer.stop();
        System.out.printf("size: %s costTime: %dms avergeTime: %.2fms\n", sizeName[i], timer.getTime(), timer.getTime() / (count * 1.0));
    }
    System.out.println("\nDone.");
}

Test results

Generating data...
>> generateBytes
size: 128K costTime: 53ms avergeTime: 5.30ms
>> generateBuffer
size: 128K costTime: 53ms avergeTime: 5.30ms
>> generateByte
size: 128K costTime: 2024ms avergeTime: 202.40ms
>> generateBytes
size: 256K costTime: 76ms avergeTime: 7.60ms
>> generateBuffer
size: 256K costTime: 78ms avergeTime: 7.80ms
>> generateByte
size: 256K costTime: 4026ms avergeTime: 402.60ms
>> generateBytes
size: 512K costTime: 159ms avergeTime: 15.90ms
>> generateBuffer
size: 512K costTime: 156ms avergeTime: 15.60ms
>> generateByte
size: 512K costTime: 8022ms avergeTime: 802.20ms
>> generateBytes
size: 1M costTime: 294ms avergeTime: 29.40ms
>> generateBuffer
size: 1M costTime: 298ms avergeTime: 29.80ms
>> generateByte
size: 1M costTime: 16606ms avergeTime: 1660.60ms
>> generateBytes
size: 2M costTime: 619ms avergeTime: 61.90ms
>> generateBuffer
size: 2M costTime: 667ms avergeTime: 66.70ms
>> generateByte
size: 2M costTime: 34323ms avergeTime: 3432.30ms
>> generateBytes
size: 4M costTime: 1269ms avergeTime: 126.90ms
>> generateBuffer
size: 4M costTime: 1272ms avergeTime: 127.20ms
>> generateByte
size: 4M costTime: 68344ms avergeTime: 6834.40ms
>> generateBytes
size: 8M costTime: 2491ms avergeTime: 249.10ms
>> generateBuffer
size: 8M costTime: 2622ms avergeTime: 262.20ms
>> generateByte
size: 8M costTime: 139931ms avergeTime: 13993.10ms

Done.

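Clearly, single-byte writes are far slower than the other two modes, while block writes are slightly faster than buffered writes here. More detailed tests follow.

Effect of block size on block write speed

This test varies the block size passed to the generateBytes method: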

// main method of WriteDataTest
public static void main(String... args) throws Exception {
    System.out.println("Generating data...");
    long size = 1024 * 1024;
    String filename = "1M";
    int count = 10;
    MyTimer timer = new MyTimer();

    // Block sizes 128, 256, ..., 32768 bytes, matching the recorded results below.
    for (int i = 128; i < 4096 * 16; i *= 2) {
        System.out.println("byteSize: " + i);
        timer.start();
        System.out.println(">> generateBytes");
        for (int j = 0; j < count; j++) {
            generateBytes(filename, size, i);
        }
        timer.stop();
        System.out.printf("count: %d costTime: %dms avergeTime: %.2fms\n", count, timer.getTime(), timer.getTime() / (count * 1.0));
    }
    System.out.println("\nDone.");
}

Test results

Generating data...
byteSize: 128
>> generateBytes
count: 10 costTime: 456ms avergeTime: 45.60ms
byteSize: 256
>> generateBytes
count: 10 costTime: 337ms avergeTime: 33.70ms
byteSize: 512
>> generateBytes
count: 10 costTime: 307ms avergeTime: 30.70ms
byteSize: 1024
>> generateBytes
count: 10 costTime: 290ms avergeTime: 29.00ms
byteSize: 2048
>> generateBytes
count: 10 costTime: 304ms avergeTime: 30.40ms
byteSize: 4096
>> generateBytes
count: 10 costTime: 278ms avergeTime: 27.80ms
byteSize: 8192
>> generateBytes
count: 10 costTime: 287ms avergeTime: 28.70ms
byteSize: 16384
>> generateBytes
count: 10 costTime: 279ms avergeTime: 27.90ms
byteSize: 32768
>> generateBytes
count: 10 costTime: 286ms avergeTime: 28.60ms

Done.

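The results above show that block size does affect write speed, but a larger block does not keep making writes faster: beyond roughly 1024 bytes the gains become very small.

Effect of buffer size on buffered write speed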

// main method of WriteDataTest
public static void main(String... args) throws Exception {
    System.out.println("Generating data...");
    long size = 1024 * 1024;
    String filename = "1M";
    int count = 10;
    MyTimer timer = new MyTimer();

    // Buffer sizes 2, 4, ..., 32768 bytes.
    for (int i = 2; i < 4096 * 16; i *= 2) {
        System.out.println("byteSize: " + i);
        timer.start();
        System.out.println(">> generateBuffer");
        for (int j = 0; j < count; j++) {
            generateBuffer(filename, size, i);
        }
        timer.stop();
        System.out.printf("count: %d costTime: %dms avergeTime: %.2fms\n", count, timer.getTime(), timer.getTime() / (count * 1.0));
    }
    System.out.println("\nDone.");
}

Test results

Generating data...
byteSize: 2
>> generateBuffer
count: 10 costTime: 9106ms avergeTime: 910.60ms
byteSize: 4
>> generateBuffer
count: 10 costTime: 4740ms avergeTime: 474.00ms
byteSize: 8
>> generateBuffer
count: 10 costTime: 2581ms avergeTime: 258.10ms
byteSize: 16
>> generateBuffer
count: 10 costTime: 1408ms avergeTime: 140.80ms
byteSize: 32
>> generateBuffer
count: 10 costTime: 877ms avergeTime: 87.70ms
byteSize: 64
>> generateBuffer
count: 10 costTime: 599ms avergeTime: 59.90ms
byteSize: 128
>> generateBuffer
count: 10 costTime: 459ms avergeTime: 45.90ms
byteSize: 256
>> generateBuffer
count: 10 costTime: 397ms avergeTime: 39.70ms
byteSize: 512
>> generateBuffer
count: 10 costTime: 361ms avergeTime: 36.10ms
byteSize: 1024
>> generateBuffer
count: 10 costTime: 341ms avergeTime: 34.10ms
byteSize: 2048
>> generateBuffer
count: 10 costTime: 307ms avergeTime: 30.70ms
byteSize: 4096
>> generateBuffer
count: 10 costTime: 297ms avergeTime: 29.70ms
byteSize: 8192
>> generateBuffer
count: 10 costTime: 294ms avergeTime: 29.40ms
byteSize: 16384
>> generateBuffer
count: 10 costTime: 306ms avergeTime: 30.60ms
byteSize: 32768
>> generateBuffer
count: 10 costTime: 306ms avergeTime: 30.60ms

Done.

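As with block writes, the buffer size affects the speed of buffered writes, and the effect shrinks as the buffer grows.

Speed difference when the block size equals the buffer size

This test compares the two approaches at the same size, across a range of sizes: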

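// main method of WriteDataTest; same setup as the previous tests
public static void main(String... args) throws Exception {
    System.out.println("Generating data...");
    long size = 1024 * 1024;
    String filename = "1M";
    int count = 10;
    MyTimer timer = new MyTimer();

    for (int i = 2; i < 4096 * 16; i *= 2) {
        System.out.println("byteSize: " + i);
        // Block writes with block size i.
        timer.start();
        System.out.println(">> generateBytes");
        for (int j = 0; j < count; j++) {
            generateBytes(filename, size, i);
        }
        timer.stop();
        System.out.printf("count: %d costTime: %dms avergeTime: %.2fms\n", count, timer.getTime(), timer.getTime() / (count * 1.0));

        // Buffered writes with buffer size i.
        timer.start();
        System.out.println(">> generateBuffer");
        for (int j = 0; j < count; j++) {
            generateBuffer(filename, size, i);
        }
        timer.stop();
        System.out.printf("count: %d costTime: %dms avergeTime: %.2fms\n", count, timer.getTime(), timer.getTime() / (count * 1.0));
    }
    System.out.println("\nDone.");
}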

Test results

Generating data...
byteSize: 2
>> generateBytes
count: 10 costTime: 8532ms avergeTime: 853.20ms
>> generateBuffer
count: 10 costTime: 8558ms avergeTime: 855.80ms
byteSize: 4
>> generateBytes
count: 10 costTime: 4405ms avergeTime: 440.50ms
>> generateBuffer
count: 10 costTime: 4441ms avergeTime: 444.10ms
byteSize: 8
>> generateBytes
count: 10 costTime: 2395ms avergeTime: 239.50ms
>> generateBuffer
count: 10 costTime: 2368ms avergeTime: 236.80ms
byteSize: 16
>> generateBytes
count: 10 costTime: 1326ms avergeTime: 132.60ms
>> generateBuffer
count: 10 costTime: 1340ms avergeTime: 134.00ms
byteSize: 32
>> generateBytes
count: 10 costTime: 821ms avergeTime: 82.10ms
>> generateBuffer
count: 10 costTime: 814ms avergeTime: 81.40ms
byteSize: 64
>> generateBytes
count: 10 costTime: 559ms avergeTime: 55.90ms
>> generateBuffer
count: 10 costTime: 559ms avergeTime: 55.90ms
byteSize: 128
>> generateBytes
count: 10 costTime: 419ms avergeTime: 41.90ms
>> generateBuffer
count: 10 costTime: 458ms avergeTime: 45.80ms
byteSize: 256
>> generateBytes
count: 10 costTime: 350ms avergeTime: 35.00ms
>> generateBuffer
count: 10 costTime: 367ms avergeTime: 36.70ms
byteSize: 512
>> generateBytes
count: 10 costTime: 338ms avergeTime: 33.80ms
>> generateBuffer
count: 10 costTime: 333ms avergeTime: 33.30ms
byteSize: 1024
>> generateBytes
count: 10 costTime: 312ms avergeTime: 31.20ms
>> generateBuffer
count: 10 costTime: 309ms avergeTime: 30.90ms
byteSize: 2048
>> generateBytes
count: 10 costTime: 300ms avergeTime: 30.00ms
>> generateBuffer
count: 10 costTime: 303ms avergeTime: 30.30ms
byteSize: 4096
>> generateBytes
count: 10 costTime: 292ms avergeTime: 29.20ms
>> generateBuffer
count: 10 costTime: 317ms avergeTime: 31.70ms
byteSize: 8192
>> generateBytes
count: 10 costTime: 284ms avergeTime: 28.40ms
>> generateBuffer
count: 10 costTime: 297ms avergeTime: 29.70ms
byteSize: 16384
>> generateBytes
count: 10 costTime: 297ms avergeTime: 29.70ms
>> generateBuffer
count: 10 costTime: 310ms avergeTime: 31.00ms
byteSize: 32768
>> generateBytes
count: 10 costTime: 323ms avergeTime: 32.30ms
>> generateBuffer
count: 10 costTime: 291ms avergeTime: 29.10ms

Done.

Results when generating 4M of data, repeating each case 25 times, and starting from a block size of 128:

Generating data...
byteSize: 128
>> generateBytes
count: 25 costTime: 3896ms avergeTime: 155.84ms
>> generateBuffer
count: 25 costTime: 3932ms avergeTime: 157.28ms
byteSize: 256
>> generateBytes
count: 25 costTime: 3151ms avergeTime: 126.04ms
>> generateBuffer
count: 25 costTime: 3194ms avergeTime: 127.76ms
byteSize: 512
>> generateBytes
count: 25 costTime: 2831ms avergeTime: 113.24ms
>> generateBuffer
count: 25 costTime: 2854ms avergeTime: 114.16ms
byteSize: 1024
>> generateBytes
count: 25 costTime: 2665ms avergeTime: 106.60ms
>> generateBuffer
count: 25 costTime: 2690ms avergeTime: 107.60ms
byteSize: 2048
>> generateBytes
count: 25 costTime: 2578ms avergeTime: 103.12ms
>> generateBuffer
count: 25 costTime: 2643ms avergeTime: 105.72ms
byteSize: 4096
>> generateBytes
count: 25 costTime: 2516ms avergeTime: 100.64ms
>> generateBuffer
count: 25 costTime: 2544ms avergeTime: 101.76ms
byteSize: 8192
>> generateBytes
count: 25 costTime: 2502ms avergeTime: 100.08ms
>> generateBuffer
count: 25 costTime: 2539ms avergeTime: 101.56ms
byteSize: 16384
>> generateBytes
count: 25 costTime: 2607ms avergeTime: 104.28ms
>> generateBuffer
count: 25 costTime: 2561ms avergeTime: 102.44ms
byteSize: 32768
>> generateBytes
count: 25 costTime: 2500ms avergeTime: 100.00ms
>> generateBuffer
count: 25 costTime: 5704ms avergeTime: 228.16ms

Done.

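Here block writes are slightly faster than buffered writes (with more data and more repetitions, the gap would be somewhat more visible).

Conclusions

  1. Under the same conditions, single-byte reads and writes are far slower than block or buffered ones, by a factor of roughly 50 in these tests. The reason is that every single read or write goes to the file itself, and across a large number of operations that overhead becomes very large.
  2. For block reads and writes, the block size affects speed: larger is faster, but with diminishing returns. The same holds for buffered reads and writes. A moderate block size is enough: too large causes memory problems, too small makes reads and writes slow. 1024 or 2048 is a reasonable choice. For buffered streams, unless there is a special need, just use the default size rather than specifying one yourself.
  3. Compared with buffered reads and writes, block reads and writes are slightly faster. The main reason is that the buffered classes are just a decorator wrapped around the underlying stream, which adds work along the way, such as extra method calls that are not strictly necessary. That makes them a little slower than block reads and writes.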
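
One way to see why conclusions 1 and 3 hold is to count how many calls actually reach the underlying stream. The sketch below replaces the file with a counting sink; WriteCallCounter and CountingSink are hypothetical names introduced for illustration, and the sink only counts calls instead of touching the disk.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical demo class: counts the calls that reach the underlying stream.
class WriteCallCounter {
    // A sink that discards the data and only counts how often it is called.
    static final class CountingSink extends OutputStream {
        long calls = 0;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    // One write(int) per byte, straight to the sink.
    static long singleByteCalls(int size) {
        CountingSink sink = new CountingSink();
        for (int i = 0; i < size; i++) {
            sink.write(i & 0x7f);
        }
        return sink.calls;
    }

    // One write(byte[], int, int) per block.
    static long blockCalls(int size, int blockSize) {
        CountingSink sink = new CountingSink();
        byte[] block = new byte[blockSize];
        for (int written = 0; written < size; written += blockSize) {
            sink.write(block, 0, Math.min(blockSize, size - written));
        }
        return sink.calls;
    }

    // One write(int) per byte, but routed through a BufferedOutputStream.
    static long bufferedCalls(int size, int bufferSize) throws IOException {
        CountingSink sink = new CountingSink();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < size; i++) {
                out.write(i & 0x7f);
            }
        }
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        int size = 1 << 20; // 1 MiB
        System.out.println("single-byte: " + singleByteCalls(size));
        System.out.println("block:       " + blockCalls(size, 1024));
        System.out.println("buffered:    " + bufferedCalls(size, 1024));
    }
}
```

For 1 MiB of data and a size of 1024, the single-byte path reaches the sink 1048576 times, while the block path and the buffered path each reach it 1024 times. The buffered path still pays for one extra method call per byte on the wrapper, which fits the small gap measured above.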

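Reading and writing with the RandomAccessFile class

This approach is more flexible and closely resembles file IO in C/C++: the seek() method moves the current position within the file, which makes reading data more convenient. The class also provides additional write and read methods that make it easier to read and write values of different types.

Reads and writes again come in two styles: direct single-byte access, or block access through a buffer you allocate yourself.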
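
As a minimal sketch of seek() combined with the typed read and write methods, the example below stores a sequence of ints and then jumps directly to one of them. The class name RandomAccessDemo, its helper methods, and the temporary file are assumptions for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: fixed-width records make positions easy to compute.
class RandomAccessDemo {
    // Writes the ints 0..count-1 back to back; each int takes 4 bytes.
    static void writeInts(Path file, int count) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            for (int i = 0; i < count; i++) {
                raf.writeInt(i);
            }
        }
    }

    // Jumps straight to the index-th int instead of reading everything before it.
    static int readIntAt(Path file, int index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek((long) index * Integer.BYTES);
            return raf.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("raf-demo", ".bin");
        try {
            writeInts(file, 100);
            System.out.println(readIntAt(file, 42));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Note that writeInt and readInt are convenience calls for small values; for bulk data the block style with read(byte[]) and write(byte[]) still applies.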

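Other issues

Many of the IO classes in the JDK's java.io package, the buffered streams for example, are written to be thread-safe. That thread safety is well done, but if a program only ever runs on a single thread, the guarantee is unnecessary, and even with JVM optimizations it can still slow reads and writes down. Besides, you can always provide thread safety yourself where you need it.

When you are sure your code runs in a single-threaded environment, or you have added your own synchronization, consider the IO classes in the com.liferay.portal.kernel.io.unsync package. They can significantly improve your application's IO performance.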
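
The idea behind such unsynchronized classes can be sketched as a buffered output stream with no locking at all. The class below, SimpleUnsyncBufferedOutputStream, is a hypothetical illustration written for this article; it is not Liferay's implementation and it is only safe when a single thread uses it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical sketch: a buffered output stream without any synchronization.
class SimpleUnsyncBufferedOutputStream extends OutputStream {
    private final OutputStream out;
    private final byte[] buf;
    private int count; // number of valid bytes currently in buf

    SimpleUnsyncBufferedOutputStream(OutputStream out, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.out = out;
        this.buf = new byte[size];
    }

    @Override
    public void write(int b) throws IOException {
        if (count == buf.length) {
            flushBuffer();
        }
        buf[count++] = (byte) b;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (len >= buf.length) {
            // Large writes bypass the buffer after pending bytes are flushed.
            flushBuffer();
            out.write(b, off, len);
            return;
        }
        if (len > buf.length - count) {
            flushBuffer();
        }
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }

    @Override
    public void flush() throws IOException {
        flushBuffer();
        out.flush();
    }

    @Override
    public void close() throws IOException {
        try {
            flush();
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream os = new SimpleUnsyncBufferedOutputStream(sink, 4)) {
            os.write(new byte[] {1, 2, 3});
            os.write(4);
            os.write(5);
            os.write(new byte[] {6, 7, 8, 9, 10});
        }
        System.out.println(Arrays.toString(sink.toByteArray()));
    }
}
```

Whether dropping the locks pays off depends on the JVM and the workload, so it is worth measuring with the same kind of timing test used above before switching.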

Original author: dcLunatic

Original link: http://dclunatic.github.io/Java-IO%E5%88%86%E6%9E%90.html

Published: September 21st 2018, 1:00:23 pm

Updated: July 11th 2021, 9:13:50 pm

Copyright notice: when reposting, remember to credit the source
