Spring Data JPA @OneToMany及FetchMode.SUBSELECT性能问题及踩坑分析

发表于 2023-04-21 更新于 2025-07-31 分类于 Spring Data JPA ，性能优化

1 背景

项目中需要对外提供分页接口，在经过几次迭代后，BA发现分页接口出现了某些特定查询条件返回timeout的情况，经过排查发现，这是@OneToMany映射与FetchMode.SUBSELECT结合使用时的性能问题。

2. 代码示例

下面给出经过脱敏和简化后的部分项目代码以作说明。

2.1 Entity Info

BucketEntity Info

@Entity
@Table(name = "bucket")
public class BucketEntity {

    @Id
    @Column(name = "bucket_id")
    private String id;

    @Column(name = "bucket_number")
    private String number;

    @Column(name = "bucket_description")
    private String description;

    private String status;

    @Column(name = "updated_timestamp")
    private Instant timestamp;

    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER)
    @Fetch(value = FetchMode.SUBSELECT)
    @JoinColumn(name = "bucket_id", referencedColumnName = "bucket_id", nullable = false, updatable = false)
    private List<BucketLocationEntity> bucketLocations;
}

BucketLocationEntity info

@Entity
@Table(name = "bucket_location")
public class BucketLocationEntity implements Serializable {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "bucket_location_id")
    private Long id;

    @Column(name = "bucket_id", insertable = false, updatable = false)
    private String bucketId;

    private String location;

    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER)
    @Fetch(value = FetchMode.SUBSELECT)
    @JoinColumn(name = "bucket_location_id", referencedColumnName = "bucket_location_id", nullable = false, updatable = false)
    private List<BucketLocationZipCodeEntity> bucketLocationZipCodes;

    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER)
    @Fetch(value = FetchMode.SUBSELECT)
    @JoinColumn(name = "bucket_location_id", referencedColumnName = "bucket_location_id", nullable = false, updatable = false)
    private List<BucketSalesAreaEntity> bucketSalesAreas;
}

2.2 Controller code

@GetMapping("...")
public PageableResponse<List<BucketDto>> getBucketsByLocation(
        String location, Integer pageNumber, Integer pageSize) {

        Page<BucketDto> page = bucketService.findBucketsByLocation(location, pageNumber, pageSize);
        return transformToPageableResponse(page);
        }

2.3 Service code

@Service
@RequiredArgsConstructor
public class BucketService {

    private final BucketRepository bucketRepository;
    private final BucketDtoTransformer bucketDtoTransformer;

    public Page<BucketDto> findBucketsByLocation(String location, int pageNumber, int pageSize) {
        Pageable pageable = PageRequest.of(pageNumber - 1, pageSize);
        Page<BucketEntity> page = bucketRepository.findAllByBucketLocations_Location(location, pageable);

    ......

        return page.map(bucketEntity -> {
            return bucketDtoTransformer.transform(bucketEntity);
        });
    }
}

3 运行结果

3.1 paging query bucket sql

Hibernate: 
select
        bucketen0_.bucket_id as bucket1_1_,
        bucketen0_.bucket_number as bulket2_1_,
        bucketen0_.bucket_description as bucket3_1_,
        bucketen0_.status as status14_1_,
        bucketen0_.updated_timestamp as updated15_1_
    from
        bucket bucketen0_ 
    left outer join
        bucket_location bucketlo1_ 
            on bucketen0_.bucket_id=bucketlo1_.bucket_id 
    where
        bucketlo1_.location = ? limit ? offset ?

3.2 query bucket_location sql

Hibernate: 
select
        bucketlo0_.bucket_id as bucket1_8_1_,
        bucketlo0_.bucket_location_id as bucket1_8_0_,
        bucketlo0_.location as location2_8_0_
    from
        bucket_location bucketlo0_ 
    where
        bucketlo0_.bucket_id in (
            select
                bucketen0_.bucket_id 
            from
                bucket bucketen0_ 
            left outer join
                bucket_location bucketlo1_ 
                    on bucketen0_.bucket_id=bucketlo1_.bucket_id 
            where
                bucketlo1_.location = ?
        )

3.3 query bucket_location_zip_code sql

Hibernate:
select
        zipcode0_.bucket_location_id as bucket5_9_1_,
        zipcode0_.valuation_type_code as valuatio2_9_0_,
        zipcode0_.handling_type_code as handling3_9_0_,
        zipcode0_.handling_type_description as handling4_9_0_
    from
        bucket_location_zip_code zipcode0_
    where
        zipcode0_.bucket_location_id in (
            select
                bucketlo0_.bucket_location_id
            from
                bucket_location bucketlo0_
            where
                bucketlo0_.bucket_id in (
                    select
                        bucketen0_.bucket_id
                    from
                        bucket bucketen0_
                    left outer join
                        bucket_location bucketlo1_
                            on bucketen0_.bucket_id=bucketlo1_.bucket_id
                    where
                        bucketlo1_.location = ?
                )
            )

4 踩坑分析

4.1 慢sql查询分析

根据打印出来的sql可以看出，第一条sql查询bucket数据时是做了limit限制的。但第二条sql查询bucket_location数据时，我们发现limit条件没有出现在子查询语句中，这就导致了一次大数据量的查询。

从第三条sql语句中不难发现，与bucket_location OneToMany关联的bucket_location_zip_code查询中会将所有location与查询条件一致的数据项查询出来，这同样也是一次大数据量的查询。

由此引发的连锁反应会导致分页查询接口经常出现timeout的报错。

4.2 原因分析

当 Hibernate 加载父实体时，它会记住用于加载父实体时的查询语句。当 Hibernate 尝试加载子实体时，父查询将用作子查询。当 Hibernate 考虑 where 条件时，它不会记住父查询中的 limit 和 offset 子句。如果子表有数百万条记录，性能损失将是严重的。

5 解决方案

5.1 方案说明

避免FetchMode.SUBSELECT在查询子实体时出现忽略limit和offset的问题，我们可以将FetchMode改为SELECT，配上@BatchSize注解并将batchSize的值设置为分页接口的MAX_PAGE_SIZE。

5.2 代码细节

@Entity
@Table(name = "bucket")
public class BucketEntity {

    @Id
    @Column(name = "bucket_id")
    private String id;

    @Column(name = "bucket_number")
    private String number;

    @Column(name = "bucket_description")
    private String description;

    private String status;

    @Column(name = "updated_timestamp")
    private Instant timestamp;

    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER)
    @Fetch(value = FetchMode.SELECT)
    @BatchSize(size = MAX_PAGE_SIZE)
    @JoinColumn(name = "bucket_id", referencedColumnName = "bucket_id", nullable = false, updatable = false)
    private List<BucketLocationEntity> bucketLocations;
}

5.3 改进结果

修改之后，我们再次用相同的请求参数调用分页接口。

分页查询的sql和修改之前一样，此处就不贴了。

Hibernate: 
select
        bucketlo0_.bucket_id as bucket1_8_1_,
        bucketlo0_.bucket_location_id as bucket1_8_0_,
        bucketlo0_.location as location2_8_0_
    from
        bucket_location bucketlo0_ 
    where
        bucketlo0_.bucket_id in (
            ?, ?, ?, ?, ?
        )

Hibernate:
select
        zipcode0_.bucket_location_id as bucket5_9_1_,
        zipcode0_.valuation_type_code as valuatio2_9_0_,
        zipcode0_.handling_type_code as handling3_9_0_,
        zipcode0_.handling_type_description as handling4_9_0_
    from
        bucket_location_zip_code zipcode0_
    where
        zipcode0_.bucket_location_id in (
            select
                bucketlo0_.bucket_location_id 
            from
                bucket_location bucketlo0_ 
            where
                bucketlo0_.bucket_id in (
                    ?, ?, ?, ?, ?
                )
        )

可以看到改进之后，sql中的查询条件变为了bucketlo0_.bucket_id in (?, ?, ?, ?, ?) 这样的限定范围的in查询，这样我们就可以通过设置@BatchSize及其value来限定查询的个数，从而达到最有效和最快速查询的目的。