Series of Backend Developer Interview Questions Part 8

This article is part of a hands-on series of backend developer interview questions, focusing on real production problems rather than textbook theory. It dives deep into JPA performance pitfalls such as N+1 queries, EntityGraph vs fetch join, casc...

I am Tuanh.net. As of 2024, I have accumulated 8 years of experience in backend programming. I am delighted to connect and share my knowledge with everyone.

1. Question 1: When should you use EntityGraph instead of fetch join?

If you’ve ever written a JPQL join fetch and felt like a hero… until pagination exploded or Hibernate started yelling about “multiple bag fetch”, welcome to the club. A fetch join is great when the query’s shape is stable and you’re intentionally pulling a specific association in the same SQL, right now, for this one use case. The moment you need the same repository method to sometimes load extra relationships and sometimes not, EntityGraph tends to win because it lets you keep the query semantics (filtering/sorting) separate from the fetching plan. In other words: fetch join hardcodes the loading strategy into the query, while EntityGraph can be applied declaratively (or dynamically) without rewriting the JPQL every time.

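EntityGraph is also a convenient default for “list screens” where you mostly want to avoid lazy-loading surprises. A fetch join on a collection multiplies root rows in the SQL result, so the database cannot apply LIMIT/OFFSET per root entity; Hibernate typically falls back to paginating in memory and logs a warning. Be aware that an entity graph containing a collection attribute produces the same kind of join, so it does not make that problem go away: for paged lists, keep the graph to to-one associations (such as customer) and load collections separately, for example with batch fetching or a second query keyed by the page’s ids. With EntityGraph, you can still keep a clean findAll(Pageable)-style query and let the provider fetch the requested attributes according to the graph. It’s not magic, because loading a big graph still costs real SQL, but the control surface is nicer: you’re describing what to load rather than rewriting the whole query.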
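To make the duplicated-root-rows problem concrete, here is a small plain-Java model of a joined result set. It involves no JPA at all; OrderRow, JoinedRow and the helper methods are hypothetical names invented for this sketch. It only illustrates why limiting joined rows is not the same as limiting orders.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model (no JPA involved) of why a LIMIT applied to a
// joined result set does not page by root entity.
final class JoinPaginationSketch {

    // Hypothetical stand-ins for an order and one row of "orders JOIN order_items".
    record OrderRow(long orderId, List<String> items) {}
    record JoinedRow(long orderId, String item) {}

    // Simulates the join: one result row per (order, item) pair.
    static List<JoinedRow> join(List<OrderRow> orders) {
        List<JoinedRow> rows = new ArrayList<>();
        for (OrderRow order : orders) {
            for (String item : order.items()) {
                rows.add(new JoinedRow(order.orderId(), item));
            }
        }
        return rows;
    }

    // Simulates a SQL-level LIMIT on the joined rows and collects the distinct roots it covers.
    static Set<Long> rootsInFirstRows(List<JoinedRow> rows, int limit) {
        Set<Long> roots = new LinkedHashSet<>();
        for (JoinedRow row : rows.subList(0, Math.min(limit, rows.size()))) {
            roots.add(row.orderId());
        }
        return roots;
    }

    public static void main(String[] args) {
        List<OrderRow> orders = List.of(
                new OrderRow(1, List.of("a", "b", "c")),
                new OrderRow(2, List.of("d", "e")),
                new OrderRow(3, List.of("f")));
        List<JoinedRow> rows = join(orders);
        System.out.println("joined rows: " + rows.size());
        System.out.println("orders covered by the first 2 rows: " + rootsInFirstRows(rows, 2));
    }
}
```

Three orders become six joined rows, and a “page” of two rows covers only order 1. That mismatch is why a provider cannot simply push the page size down to SQL once a collection is joined in.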

Here’s a concrete example using Spring Data JPA’s @EntityGraph to load items and each item’s product, while still allowing pageable queries without embedding join fetch into every JPQL string:

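import jakarta.persistence.*; // javax.persistence on pre-Jakarta stacks
import java.util.ArrayList;
import java.util.List;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.EntityGraph;
import org.springframework.data.jpa.repository.JpaRepository;

@Entity
@Table(name = "orders") // "order" is a reserved word in SQL
class Order {
    @Id Long id;

    @ManyToOne(fetch = FetchType.LAZY)
    Customer customer;

    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    List<OrderItem> items = new ArrayList<>();
}

@Entity
class OrderItem {
    @Id Long id;

    @ManyToOne(fetch = FetchType.LAZY)
    Order order;

    @ManyToOne(fetch = FetchType.LAZY)
    Product product;

    int quantity;
}

public interface OrderRepository extends JpaRepository<Order, Long> {

    // Loads items and each item's product together with the orders.
    // Caveat: "items" is a collection, so Hibernate may paginate this in memory.
    @EntityGraph(attributePaths = {"items", "items.product"})
    Page<Order> findWithItemsByCustomerId(Long customerId, Pageable pageable);

    // Same filtering, different fetching plan (associations stay lazy):
    Page<Order> findByCustomerId(Long customerId, Pageable pageable);
}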

What’s happening here is subtle but important. Both methods can run the same “business” query (find orders by customer) yet you can choose the fetching plan by calling the appropriate method. This is exactly the kind of flexibility that becomes painful with join fetch, because you’d typically have to duplicate JPQL queries or build dynamic JPQL strings. Also, when you later realize you only need customer but not items, EntityGraph lets you change that without touching the query logic that decides which orders to return.

Image reference (Spring Data + EntityGraph concepts are easier to remember with a diagram of “root entity + attribute nodes”):

https://www.baeldung.com/jpa-entity-graph


2. Question 2: What are the pros and cons of using CascadeType.ALL?

CascadeType.ALL is like giving your entity relationship a master key and then acting surprised when it opens doors you didn’t mean to unlock. The main benefit is convenience: persistence operations on the parent automatically propagate to children—persist, merge, remove, refresh, detach. That can dramatically reduce boilerplate in true “aggregate” relationships where the child’s lifecycle is fully owned by the parent. If an Order owns OrderItems and items should never exist without an order, cascading persist and remove can be exactly what you want.

The risk is that “ALL” includes operations you might not intend. The most common production horror story is accidental deletes: a developer calls orderRepository.delete(order) and suddenly items disappear too, which may be correct… unless those “items” were actually shared references, or you later changed the domain rule and forgot the mapping. Another subtle cost is merge: cascading merge across a deep graph can trigger a lot of selects/updates and make a simple update endpoint feel like it’s dragging a piano up the stairs.

A safer pattern is to cascade only what reflects your domain rules. For example, in many systems you want PERSIST and MERGE but you intentionally avoid REMOVE so deletes must be explicit (and reviewable). You also typically pair ownership semantics with orphanRemoval = true rather than relying on REMOVE for every case, because orphan removal expresses “child removed from the collection should be deleted”, which is usually closer to business intent.

@Entity
@Table(name = "orders") // "order" is a reserved word in SQL
class Order {
    @Id @GeneratedValue Long id;

    @OneToMany(mappedBy = "order",
               cascade = {CascadeType.PERSIST, CascadeType.MERGE},
               orphanRemoval = true)
    private List<OrderItem> items = new ArrayList<>();

    public void addItem(Product p, int qty) {
        OrderItem item = new OrderItem(this, p, qty);
        items.add(item);
    }

    public void removeItem(Long itemId) {
        // itemId.equals(...) is null-safe for items that have not been persisted yet
        items.removeIf(i -> itemId.equals(i.getId())); // orphanRemoval deletes it
    }
}

@Entity
class OrderItem {
    @Id @GeneratedValue Long id;

    @ManyToOne(fetch = FetchType.LAZY)
    private Order order;

    @ManyToOne(fetch = FetchType.LAZY)
    private Product product;

    private int quantity;

    protected OrderItem() {} // required by JPA

    public OrderItem(Order order, Product product, int quantity) {
        this.order = order;
        this.product = product;
        this.quantity = quantity;
    }

    public Long getId() {
        return id;
    }
}

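This code makes the lifecycle rule obvious: the order “owns” the items (orphan removal), and typical save flows are convenient (persist/merge cascade). One caveat is worth stating out loud in an interview: per the JPA specification, orphanRemoval = true also causes the children to be removed when the parent itself is removed, so delete(order) still deletes its items even though REMOVE is not listed in cascade. If deleting an order must leave its items untouched, or must fail loudly, drop orphanRemoval and delete children explicitly. That’s the real interview-level point: cascading is not a JPA trick, it’s a domain commitment. If you wouldn’t say it confidently in plain English, you probably shouldn’t encode it as ALL.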

3. Question 3: Why is batch insert/update in JPA hard?

Batch writing sounds simple until you realize JPA is not a “SQL generator”; it’s a persistence context manager with identity tracking, dirty checking, and ordering rules. The persistence context keeps every managed entity in memory to guarantee “same id = same object instance” and to compute changes. If you insert 200,000 rows naïvely in one transaction, you don’t just pay for inserts—you pay for the memory and bookkeeping of 200,000 managed instances, plus the flush-time cost of comparing snapshots, plus possible cascade traversals. That’s why “just loop and save” becomes a slow-motion memory leak disguised as a feature.
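The “same id = same object instance” guarantee and its memory cost can be pictured with a toy identity map. This is a minimal plain-Java sketch, not Hibernate’s implementation; IdentityMapSketch, its Customer record and the loader parameter are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Toy model of a persistence context's identity map; not Hibernate's implementation.
final class IdentityMapSketch {

    record Customer(long id, String name) {} // hypothetical entity

    private final Map<Long, Customer> managed = new HashMap<>();

    // Returns the already-managed instance if present; otherwise loads it once and tracks it.
    Customer find(long id, LongFunction<Customer> loader) {
        return managed.computeIfAbsent(id, key -> loader.apply(key));
    }

    // Every persisted entity stays referenced until clear() is called.
    void persist(Customer customer) {
        managed.put(customer.id(), customer);
    }

    int managedCount() {
        return managed.size();
    }

    // Drops all references, which is what lets a long import keep its memory flat.
    void clear() {
        managed.clear();
    }
}
```

Two lookups for the same id return the same instance, and every persisted entity stays reachable until the context is cleared. In a 200,000-row loop, that retained map is the part that grows.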

The second reason is that batching is provider-specific in practice. Hibernate can batch SQL statements, but it needs help: you must flush periodically, clear the persistence context to drop references, and configure batching settings. Updates are even trickier because dirty checking can generate unexpected SQL patterns, and versioned entities add extra constraints. On top of that, ID generation strategies matter: some strategies (like identity columns) can inhibit batching because the provider needs the generated key immediately.

A practical Hibernate-style approach looks like this: persist in chunks, flush() to send SQL, clear() to drop managed instances, and keep the transaction boundaries reasonable.

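import jakarta.persistence.EntityManager; // javax.persistence on pre-Jakarta stacks
import java.util.List;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class BulkImportService {
    private final EntityManager em;

    public BulkImportService(EntityManager em) {
        this.em = em;
    }

    @Transactional
    public void insertCustomers(List<Customer> customers) {
        // Ideally matches hibernate.jdbc.batch_size; JDBC batching is off unless that is configured.
        int batchSize = 50;

        for (int i = 0; i < customers.size(); i++) {
            em.persist(customers.get(i));

            // (i + 1) so that every flush covers exactly batchSize entities
            if ((i + 1) % batchSize == 0) {
                em.flush(); // sends the pending INSERTs (batched when JDBC batching is enabled)
                em.clear(); // prevents persistence-context memory blow-up
            }
        }

        em.flush(); // remainder that did not fill a whole batch
        em.clear();
    }
}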

This example is doing two jobs at once. First, it gives Hibernate a chance to group statements (when batching is enabled) rather than issuing one round-trip per row. Second, it keeps the persistence context from growing without bound, which is the silent killer in large imports. If you’re asked “why is batch hard?”, the best answer is: because JPA’s unit of work is designed for correctness of object graphs, not for streaming raw rows, so you must actively manage the persistence context lifecycle and tune provider behavior to avoid fighting the framework.
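The exact flush condition is easy to get subtly wrong, so here is a plain-Java sketch that only counts how many entities each flush would cover. It compares two common ways of writing the check. FlushBoundarySketch and flushSizes are hypothetical names, and no EntityManager is involved.

```java
import java.util.ArrayList;
import java.util.List;

// Counts how many entities each flush would cover for a given loop condition.
final class FlushBoundarySketch {

    // oneBased = true  uses (i + 1) % batchSize == 0
    // oneBased = false uses i > 0 && i % batchSize == 0
    static List<Integer> flushSizes(int total, int batchSize, boolean oneBased) {
        List<Integer> sizes = new ArrayList<>();
        int pending = 0;
        for (int i = 0; i < total; i++) {
            pending++; // stands in for em.persist(entity)
            boolean boundary = oneBased
                    ? (i + 1) % batchSize == 0
                    : i > 0 && i % batchSize == 0;
            if (boundary) {
                sizes.add(pending); // stands in for em.flush(); em.clear();
                pending = 0;
            }
        }
        if (pending > 0) {
            sizes.add(pending); // final flush for the remainder
        }
        return sizes;
    }
}
```

For 120 entities and a batch size of 50, the one-based check yields flushes of 50, 50 and 20, while the zero-based check yields 51, 50 and 19. The first flush then spills one statement past the configured JDBC batch size.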

Image reference (batch fetching/batching discussions and pitfalls are often explained alongside Hibernate batching knobs like default_batch_fetch_size / @BatchSize):

https://prasanthmathialagan.wordpress.com/2017/04/20/beware-of-hibernate-batch-fetching/


4. Question 4: When should you choose WebFlux instead of Spring MVC?

Choose WebFlux when your system’s bottleneck is waiting: waiting on network I/O, streaming data, fan-out calls to other services, slow clients, server-sent events, or WebSockets, where tying up one thread per request becomes expensive. The promise of WebFlux isn’t “it’s faster by default”; it’s “it scales differently under high concurrency because it’s built around non-blocking I/O and backpressure-friendly pipelines.” Spring itself describes MVC as the servlet, blocking I/O model, while WebFlux targets a reactive stack built for non-blocking I/O.

The trap (and interviewers love this one) is mixing WebFlux with blocking dependencies and expecting miracles. If your persistence layer is classic JPA/JDBC and your downstream SDKs block, you’ll spend time shuffling blocking work onto a boundedElastic scheduler, effectively rebuilding “thread-per-request” with extra steps. That doesn’t mean it’s impossible, but it often means MVC is the more honest choice for typical CRUD services. A pragmatic rule: if your end-to-end path can be reactive (reactive DB driver, reactive messaging, reactive HTTP clients) and you have concurrency/streaming needs, WebFlux shines; if not, MVC is usually simpler and just as performant for normal loads.

Here’s a small but real WebFlux-style controller that calls two downstream services concurrently and returns a composed response. The “value” is that you can serve many concurrent requests without pinning a thread while waiting on those remote calls.

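import java.util.List;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

// UserDto and OrderDto are assumed to be simple DTO classes defined elsewhere.
@RestController
@RequestMapping("/api/profile")
public class ProfileController {

    private final WebClient webClient;

    public ProfileController(WebClient.Builder builder) {
        this.webClient = builder.baseUrl("https://downstream.internal").build();
    }

    @GetMapping("/{userId}")
    public Mono<ProfileView> profile(@PathVariable String userId) {
        Mono<UserDto> userMono = webClient.get()
                .uri("/users/{id}", userId)
                .retrieve()
                .bodyToMono(UserDto.class);

        Mono<List<OrderDto>> ordersMono = webClient.get()
                .uri("/orders?userId={id}", userId)
                .retrieve()
                .bodyToFlux(OrderDto.class)
                .collectList();

        // zip subscribes to both sources, so the two calls are in flight at the same time
        return Mono.zip(userMono, ordersMono)
                .map(tuple -> new ProfileView(tuple.getT1(), tuple.getT2()));
    }

    public record ProfileView(UserDto user, List<OrderDto> orders) {}
}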

This is not just “reactive syntax.” The important detail is that both HTTP calls are initiated without blocking, and the response is produced when both complete. In MVC, you can do async too, but the reactive programming model makes composition (zip/merge/backpressure) first-class, which matters most when your endpoint is inherently streaming or concurrency-heavy.
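For comparison, the same “start both, combine when both finish” shape can be written with the JDK’s CompletableFuture, which is one way to do async composition outside the reactive stack. This is a sketch under assumptions: fetchUser and fetchOrders are hypothetical stand-ins that fabricate data on the common pool instead of calling a real service. It illustrates composition only, not non-blocking I/O.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// JDK-only analogue of the zip-style composition; no Spring or Reactor involved.
final class FanOutSketch {

    record UserDto(String id, String name) {}
    record OrderDto(String id) {}
    record ProfileView(UserDto user, List<OrderDto> orders) {}

    // Hypothetical downstream call; a real one would go through an HTTP client.
    static CompletableFuture<UserDto> fetchUser(String userId) {
        return CompletableFuture.supplyAsync(() -> new UserDto(userId, "user-" + userId));
    }

    // Hypothetical downstream call returning two fixed orders.
    static CompletableFuture<List<OrderDto>> fetchOrders(String userId) {
        return CompletableFuture.supplyAsync(() -> List.of(new OrderDto("o-1"), new OrderDto("o-2")));
    }

    // Both futures are started before either is awaited, then combined once both complete.
    static CompletableFuture<ProfileView> profile(String userId) {
        CompletableFuture<UserDto> user = fetchUser(userId);
        CompletableFuture<List<OrderDto>> orders = fetchOrders(userId);
        return user.thenCombine(orders, ProfileView::new);
    }
}
```

thenCombine covers the two-source case well enough. What it does not give you is streaming or backpressure, which is where the reactive operators earn their keep.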

Image reference (official WebFlux overview from Spring):

https://docs.spring.io/spring-framework/reference/web/webflux/new-framework.html


5. Question 5: How does Spring Security’s filter chain work?

Spring Security sits in front of your controllers as a chain of servlet filters. Instead of each security filter being registered directly with the servlet container, they are managed as Spring beans and orchestrated by FilterChainProxy, which is itself typically reached via DelegatingFilterProxy in the servlet container. This design matters because it allows Spring Security to choose which security filter chain applies to a given request and to keep everything consistent with the Spring ApplicationContext lifecycle.

In practical terms, a request comes in, hits the servlet container’s filter entry point, gets delegated into Spring’s springSecurityFilterChain, then FilterChainProxy selects a matching SecurityFilterChain (based on request matchers), and runs filters in order. Filters do very specific jobs: extracting credentials, establishing an Authentication, storing it in the SecurityContext, enforcing authorization, handling exceptions, and so on. If you ever need to debug “why is my request 403?”, putting a breakpoint or enabling debug logs around the proxy/chain selection is usually more productive than guessing in the controller, because your controller is often innocent; it’s just being blocked at the door.
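The selection-then-ordered-execution flow described above can be modelled in a few lines of plain Java. This is a simplified sketch, not Spring Security’s code: FilterChainSketch, SecurityChain, and the named/deny helpers are hypothetical, and the “request” is reduced to a path string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Simplified model of "pick the first matching chain, then run its filters in order".
final class FilterChainSketch {

    // A filter records what it did and decides whether to pass the request on.
    interface Filter {
        void doFilter(String path, List<String> trace, Runnable next);
    }

    record SecurityChain(Predicate<String> matcher, List<Filter> filters) {}

    // Selects the first chain whose matcher accepts the path and returns the execution trace.
    static List<String> handle(String path, List<SecurityChain> chains) {
        List<String> trace = new ArrayList<>();
        for (SecurityChain chain : chains) {
            if (chain.matcher().test(path)) {
                run(chain.filters(), 0, path, trace);
                return trace;
            }
        }
        trace.add("controller"); // no chain matched, so nothing guards this request
        return trace;
    }

    private static void run(List<Filter> filters, int index, String path, List<String> trace) {
        if (index == filters.size()) {
            trace.add("controller"); // every filter passed the request on
            return;
        }
        filters.get(index).doFilter(path, trace, () -> run(filters, index + 1, path, trace));
    }

    // A filter that does its job and continues down the chain.
    static Filter named(String name) {
        return (path, trace, next) -> { trace.add(name); next.run(); };
    }

    // A filter that rejects by never calling next, so the controller is not reached.
    static Filter deny(String name) {
        return (path, trace, next) -> trace.add(name + ":403");
    }
}
```

With an "/api/" chain ending in a deny filter, the trace for "/api/x" stops at the rejecting filter and never contains "controller". In this model, that is what “blocked at the door” looks like.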

Here’s a modern Spring Security (Boot 3 / Security 6 style) configuration that shows the chain, plus a tiny custom filter that logs the authenticated principal. The key is not the logging—it’s understanding where this runs and why order matters.

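import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.security.web.SecurityFilterChain;
import org.springframework.security.web.authentication.www.BasicAuthenticationFilter;
import org.springframework.web.filter.OncePerRequestFilter;

@Configuration
@EnableWebSecurity
public class SecurityConfig {

    @Bean
    SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
        return http
                .csrf(csrf -> csrf.disable())
                .authorizeHttpRequests(auth -> auth
                        .requestMatchers("/public/**").permitAll()
                        .anyRequest().authenticated()
                )
                .httpBasic(Customizer.withDefaults())
                // Runs after Basic authentication has had a chance to populate the SecurityContext
                .addFilterAfter(new PrincipalLoggingFilter(), BasicAuthenticationFilter.class)
                .build();
    }

    static class PrincipalLoggingFilter extends OncePerRequestFilter {
        @Override
        protected void doFilterInternal(HttpServletRequest request,
                                        HttpServletResponse response,
                                        FilterChain filterChain)
                throws ServletException, IOException {

            Authentication auth = SecurityContextHolder.getContext().getAuthentication();
            if (auth != null && auth.isAuthenticated()) {
                System.out.println("Authenticated as: " + auth.getName());
            }
            // Always continue the chain; this filter only observes
            filterChain.doFilter(request, response);
        }
    }
}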

What this demonstrates is: your custom filter runs inside the same filter chain as authentication/authorization filters. Placing it “after BasicAuthenticationFilter” means your log sees an established Authentication (if credentials were valid). If you put it too early, the Authentication is still null, the log line never appears, and you wonder if Spring Security is broken. Spoiler: it’s not broken, it’s just earlier in the chain.

Image reference (official Spring Security servlet architecture and chain discussion):

https://docs.spring.io/spring-security/reference/servlet/architecture.html

If you want me to extend this series with more “real-world traps” (like pagination + fetch join edge cases, Jackson + lazy proxies, or security filter ordering bugs that only appear in production), drop your questions in the comments below.
