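Designing a Daily Matching Batch System: From Spring Scheduler to a Large-Scale Pipeline

Why Matching Happens Every Day in the Early Morning

Every day at 2 a.m., 썸블링 generates daily matches for all active members. There are reasons we do not match in real time.

Real-time matching computes candidates on the spot every time a user opens the app. It responds immediately, but during peak hours (evenings after work), when everyone opens the app at once, demand for compute spikes. Real-time matching also offers no guarantee that two people will see each other as candidates at the same time, which leads to asymmetric matches.

The batch approach is the opposite. The heavy computation runs in the early morning, when traffic is lowest, and the results are stored in the database, so users simply read precomputed results when they open the app. A batch can also guarantee symmetric matching: if B appears among A's candidates, A appears among B's.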

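Batch Architecture Overview

[Cron trigger at 2 a.m.]
        ↓
[Stage 1] Expire the previous day's PENDING matches (bulk UPDATE)
        ↓
[Stage 2] Fetch members with an active matching profile
        ↓
[Stage 3] Run the matching pipeline per member (parallel processing)
   ├── 3-1. Filtering: exclude by gender/age/blocks/match history
   ├── 3-2. Scoring: compute cosine similarity
   └── 3-3. Selection: pick the Top 3 candidates and save DailyMatch
        ↓
[Stage 4] Log batch results and monitor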

flowchart TD
    A["⏰ Cron trigger at 2 a.m."] --> B["Stage 1: Expiration<br/>PENDING → EXPIRED<br/>single bulk UPDATE"]
    B --> C["Stage 2: Fetch active members<br/>MatchingProfile.isActive<br/>+ Member.status = ACTIVE<br/>+ SoulTestResult exists"]
    C --> D["Stage 3: Per-member pipeline"]
    D --> E["Filtering<br/>QueryDSL dynamic conditions"]
    E --> F["Scoring<br/>Native Query cosine similarity"]
    F --> G["Selection: Top 3<br/>diversity correction applied"]
    G --> H["Save DailyMatch"]
    H --> I["Stage 4: Logging · Monitoring<br/>Slack alerts"]

    style A fill:#7F5AF0,stroke:#7F5AF0,color:#fff
    style I fill:#FF6B9D,stroke:#FF6B9D,color:#fff

Responsibility is delegated along the chain MatchingScheduler → MatchingBatchService → MatchingService.

Spring Scheduler Configuration

The aro-matching module enables @EnableScheduling and specifies the run schedule with a cron expression.

// aro-matching/src/main/java/aro/matching/AroMatchingApplication.java
@SpringBootApplication
@EnableScheduling  // enable the scheduler
public class AroMatchingApplication {
    public static void main(String[] args) {
        SpringApplication.run(AroMatchingApplication.class, args);
    }
}

/**
 * Scheduler that runs the daily matching batch.
 * Runs automatically every day at 02:00 KST.
 *
 * @author Rojae
 */
@Slf4j
@Component
@RequiredArgsConstructor
public class MatchingScheduler {

    private final MatchingBatchService batchService;
    // Assumed collaborator: the original snippet calls alertService without
    // declaring it, so the type name AlertService is a placeholder.
    private final AlertService alertService;

    /**
     * Entry point of the daily matching batch.
     * Cron fields: second minute hour day-of-month month day-of-week
     * "0 0 2 * * *" = every day at 02:00:00
     */
    @Scheduled(cron = "0 0 2 * * *", zone = "Asia/Seoul")
    public void runDailyMatchingBatch() {
        // Use the KST date so the log matches the schedule's time zone.
        log.info("[Batch start] Daily matching batch - {}",
            LocalDate.now(ZoneId.of("Asia/Seoul")));
        long startTime = System.currentTimeMillis();

        try {
            MatchingBatchResult result = batchService.executeDailyBatch();
            long elapsed = System.currentTimeMillis() - startTime;

            log.info(
                "[Batch done] target members: {}, matches created: {}, elapsed: {}ms",
                result.getTotalMembers(),
                result.getTotalMatchesCreated(),
                elapsed
            );
        } catch (Exception e) {
            log.error("[Batch failed] Daily matching batch error", e);
            // Send an alert (Slack webhook or PagerDuty)
            alertService.sendBatchFailureAlert(e);
        }
    }

    /**
     * Manual trigger for administrators.
     * Invoked by POST /api/admin/matches/generate.
     */
    public void triggerManually() {
        log.info("[Manual trigger] Matching batch started by admin request");
        runDailyMatchingBatch();
    }
}

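If zone = "Asia/Seoul" is not specified, the cron expression is evaluated in the server's default time zone (usually UTC). To guarantee a 02:00 run in Korean time, the time zone must be set explicitly.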
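To make the time-zone point concrete, here is a minimal sketch that uses only java.time to find the next 02:00 run in Asia/Seoul and the UTC instant it corresponds to. The class KstScheduleDemo and the method nextRunAfter are hypothetical names for illustration; Spring's scheduler does this calculation internally.

```java
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helper (not part of the original code base): shows which UTC
// instant corresponds to a 02:00 run in Asia/Seoul.
final class KstScheduleDemo {

    static final ZoneId KST = ZoneId.of("Asia/Seoul");

    // Returns the first 02:00 KST that is strictly after the given moment.
    static ZonedDateTime nextRunAfter(ZonedDateTime now) {
        ZonedDateTime nowKst = now.withZoneSameInstant(KST);
        ZonedDateTime candidate =
            nowKst.toLocalDate().atTime(LocalTime.of(2, 0)).atZone(KST);
        return candidate.isAfter(nowKst) ? candidate : candidate.plusDays(1);
    }

    public static void main(String[] args) {
        // 12:00 UTC on 1 March is 21:00 KST on the same day.
        ZonedDateTime now = ZonedDateTime.of(2024, 3, 1, 12, 0, 0, 0, ZoneOffset.UTC);
        ZonedDateTime next = nextRunAfter(now);
        System.out.println("next run (KST): " + next);
        System.out.println("same instant (UTC): " + next.withZoneSameInstant(ZoneOffset.UTC));
    }
}
```

In this example the next run is 02:00 KST on 2 March, which is 17:00 UTC on 1 March. On a server whose default zone is UTC, a cron of "0 0 2 * * *" without a zone would instead fire at 11:00 KST, and a zone-less LocalDate.now() at the real run time would report the previous calendar day.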

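Stage 1: Expiration with a Bulk UPDATE

When the batch starts, the first step is to mark all PENDING matches dated before today as EXPIRED. Instead of issuing one UPDATE per row, a single Native Query bulk UPDATE handles them all.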

/**
 * Expires stale PENDING matches in a single statement.
 *
 * <p>Why a Native Query: updating thousands of rows one by one incurs
 * JPA dirty-checking cost. A bulk UPDATE handles them in one query.
 *
 * SQL: UPDATE daily_matches
 *      SET status = 'EXPIRED', updated_at = NOW()
 *      WHERE status = 'PENDING' AND match_date < CURRENT_DATE
 *
 * @return number of rows updated
 * @author Rojae
 */
@Modifying(clearAutomatically = true, flushAutomatically = true)
@Transactional
@Query(
    value = """
        UPDATE daily_matches
        SET status = 'EXPIRED',
            updated_at = NOW()
        WHERE status = 'PENDING'
          AND match_date < CURRENT_DATE
        """,
    nativeQuery = true
)
int bulkExpirePendingMatches();

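clearAutomatically = true clears the persistence context (the first-level cache) automatically after the bulk UPDATE. Without it, later queries can return stale cached data. flushAutomatically = true flushes pending changes to the database before the UPDATE runs.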

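Stage 2: Fetching the Active Member List

Only members with an active matching profile are batch targets. Members who have not completed the soul test, or whose accounts are dormant or withdrawn, are excluded.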

/**
 * Returns the IDs of active members targeted by the batch.
 * A member must have an active matching profile, status ACTIVE,
 * and an existing soul test result.
 *
 * @return IDs of the members to process
 * @author Rojae
 */
@Query("""
    SELECT mp.member.id
    FROM MatchingProfile mp
    JOIN mp.member m
    WHERE mp.isActive = true
      AND m.status = 'ACTIVE'
      AND EXISTS (
          SELECT 1 FROM SoulTestResult str
          WHERE str.member.id = m.id
      )
    """)
List<Long> findActiveMemberIds();

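The result of this query is the batch's full workload. With 100,000 members, of whom 60,000 have an active matching profile, the list holds 60,000 Long values. To keep memory usage low, the query returns only IDs rather than full entities.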

Stage 3: The Matching Pipeline

For each member, a three-step pipeline of filtering → scoring → selection runs, and the selected matches are then saved.

/**
 * Generates the daily matches for one member.
 * Consists of filtering → scoring → Top N selection, followed by saving.
 *
 * @param memberId ID of the member to match
 * @author Rojae
 */
@Transactional
public void generateDailyMatchesFor(Long memberId) {
    // Step 1: filtering, narrow down the candidate pool
    MatchCandidateFilter filter = buildFilter(memberId);
    List<Long> candidateIds = matchingProfileRepository.findCandidateIds(filter);

    if (candidateIds.isEmpty()) {
        log.debug("No match candidates: memberId={}", memberId);
        return;
    }

    // Step 2: scoring, compute cosine similarity
    List<ScoredCandidate> scoredCandidates =
        matchingProfileRepository.findTopByCosineSimilarity(
            memberId, candidateIds, MAX_DAILY_MATCHES * 3  // over-fetch for headroom
        );

    // Step 3: selection, pick the Top N (with diversity correction)
    List<ScoredCandidate> selected = selectWithDiversity(
        scoredCandidates, MAX_DAILY_MATCHES
    );

    // Step 4: save the DailyMatch rows
    List<DailyMatch> matches = selected.stream()
        .map(candidate -> DailyMatch.create(
            memberId,
            candidate.getMemberId(),
            candidate.getSoulScore()
        ))
        .toList();

    dailyMatchRepository.saveAll(matches);
}

private static final int MAX_DAILY_MATCHES = 3;
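Step 2 above delegates scoring to findTopByCosineSimilarity, which the article implements as a native SQL query that is not shown. As a rough illustration of what that score measures, the sketch below computes cosine similarity in plain Java. The class name, the five-dimensional vectors, and the zero-vector handling are assumptions made for this example only.

```java
// Hypothetical in-memory version of the scoring step; the article's real
// implementation runs as a native SQL query.
final class CosineSimilarityDemo {

    // cos(a, b) = (a · b) / (|a| * |b|); returns 0 when either vector has zero length.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Illustrative vectors only; real profiles would carry more dimensions.
        double[] me = {72, 58, 81, 65, 34};
        double[] other = {70, 60, 79, 66, 30};
        System.out.printf("similarity = %.4f%n", cosine(me, other));
    }
}
```

Vectors that point in the same direction score 1 regardless of magnitude, and orthogonal vectors score 0, which is why the measure suits comparing profile shapes rather than absolute values.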

Filtering Step in Detail

private MatchCandidateFilter buildFilter(Long memberId) {
    // Load the member's matching preferences (the default is built only if missing)
    MatchPreference preference = matchPreferenceRepository
        .findByMemberId(memberId)
        .orElseGet(() -> MatchPreference.defaultPreference(memberId));

    // IDs of partners already matched today (prevents duplicates)
    List<Long> alreadyMatchedIds = dailyMatchRepository
        .findMatchedMemberIdsByMemberIdAndDate(memberId, LocalDate.now());

    // Block list
    List<Long> blockedIds = memberBlockRepository
        .findBlockedIdsByBlockerId(memberId);

    return MatchCandidateFilter.builder()
        .memberId(memberId)
        .preferredGender(preference.getPreferredGender())
        .minAge(preference.getMinAge())
        .maxAge(preference.getMaxAge())
        .excludeIds(Stream.concat(
            alreadyMatchedIds.stream(),
            blockedIds.stream()
        ).toList())
        .build();
}
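The filter built above is applied by a QueryDSL query that the article does not show. To illustrate the conditions it encodes, here is a hedged in-memory equivalent. The records Profile and Filter, their fields, and the self-exclusion rule are assumptions for this sketch, not the project's actual types.

```java
import java.util.List;
import java.util.Set;

// Hypothetical in-memory equivalent of the candidate filter; the article's
// version runs as a QueryDSL query. All type and field names are assumptions.
final class CandidateFilterDemo {

    record Profile(long memberId, String gender, int age) {}

    record Filter(long memberId, String preferredGender, int minAge, int maxAge,
                  Set<Long> excludeIds) {}

    // Keeps profiles that match the preference and are neither excluded nor the member itself.
    static List<Long> candidateIds(List<Profile> all, Filter f) {
        return all.stream()
            .filter(p -> p.memberId() != f.memberId())
            .filter(p -> p.gender().equals(f.preferredGender()))
            .filter(p -> p.age() >= f.minAge() && p.age() <= f.maxAge())
            .filter(p -> !f.excludeIds().contains(p.memberId()))
            .map(Profile::memberId)
            .toList();
    }

    public static void main(String[] args) {
        List<Profile> all = List.of(
            new Profile(1, "F", 29), new Profile(2, "M", 31),
            new Profile(3, "M", 45), new Profile(4, "M", 28), new Profile(5, "M", 30));
        Filter f = new Filter(1, "M", 25, 35, Set.of(4L));
        System.out.println(candidateIds(all, f)); // [2, 5]
    }
}
```

Member 3 falls outside the age range and member 4 is in the exclusion set, so only members 2 and 5 remain. Pushing the same predicates into the database query, as the article does, avoids loading every profile into memory.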

Diversity Correction

Simply taking the top three by Soul Score can surface similar people every day. To keep the excitement of a new discovery each day, we apply a diversity correction.

flowchart TD
    A["Cosine similarity Top 9<br/>(over-fetched)"] --> B["Top 1: highest score<br/>best compatibility"]
    A --> C["Top 2: within top 30%<br/>maximum OCEAN diversity<br/>a fresh spark"]
    A --> D["Top 3: within top 50%<br/>random pick<br/>an unexpected encounter"]
    B --> E["Today's 3 matches"]
    C --> E
    D --> E

    style B fill:#7F5AF0,stroke:#7F5AF0,color:#fff
    style C fill:#FF6B9D,stroke:#FF6B9D,color:#fff
    style D fill:#94A1B2,stroke:#94A1B2,color:#fff

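/**
 * Selects the final matches from the top-scoring candidates while accounting for diversity.
 *
 * <p>Strategy:
 * - Top 1: the highest-scoring candidate (best compatibility)
 * - Top 2: among high scorers, the candidate whose OCEAN profile differs most from the first (a fresh spark)
 * - Top 3: a random pick among high scorers (an unpredictable thrill)
 *
 * @param candidates candidates sorted by score, best first
 * @param count      number of candidates to select
 * @return the selected candidates
 * @author Rojae
 */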
private List<ScoredCandidate> selectWithDiversity(
    List<ScoredCandidate> candidates,
    int count
) {
    if (candidates.size() <= count) return candidates;

    List<ScoredCandidate> result = new ArrayList<>();

    // Top 1: highest score
    result.add(candidates.get(0));

    if (count >= 2 && candidates.size() > 1) {
        // Top 2: most diverse candidate within the top 30%
        int diversityPoolSize = Math.max(2, candidates.size() * 30 / 100);
        ScoredCandidate diverse = candidates.subList(1, diversityPoolSize).stream()
            .max(Comparator.comparingDouble(c ->
                oceanDivergence(candidates.get(0), c)))
            .orElse(candidates.get(1));
        result.add(diverse);
    }

    if (count >= 3 && candidates.size() > 2) {
        // Top 3: random pick within the top 50% (excluding candidates already selected)
        Set<Long> selectedIds = result.stream()
            .map(ScoredCandidate::getMemberId)
            .collect(Collectors.toSet());

        int randomPoolSize = Math.max(3, candidates.size() * 50 / 100);
        List<ScoredCandidate> randomPool = candidates.subList(0, randomPoolSize)
            .stream()
            .filter(c -> !selectedIds.contains(c.getMemberId()))
            .toList();

        if (!randomPool.isEmpty()) {
            int randomIndex = ThreadLocalRandom.current().nextInt(randomPool.size());
            result.add(randomPool.get(randomIndex));
        }
    }

    return result;
}

private double oceanDivergence(ScoredCandidate a, ScoredCandidate b) {
    // Euclidean distance between two OCEAN vectors (the diversity metric)
    double[] aOcean = a.getOceanVector();
    double[] bOcean = b.getOceanVector();
    double sum = 0;
    for (int i = 0; i < aOcean.length; i++) {
        sum += Math.pow(aOcean[i] - bOcean[i], 2);
    }
    return Math.sqrt(sum);
}

This strategy delivers three distinct experiences every day: "today's best match," "a fresh spark," and "an unexpected encounter."
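How large the two pools actually are depends on integer arithmetic over the candidate count, so it is worth working through. The sketch below mirrors the two pool-size expressions from selectWithDiversity; the class and method names are hypothetical and exist only for this illustration.

```java
// Hypothetical helper mirroring the pool-size arithmetic in selectWithDiversity.
final class DiversityPoolDemo {

    // Exclusive end index of the "top 30%" pool; index 0 (the best match) is skipped by the caller.
    static int diversityPoolEnd(int candidateCount) {
        return Math.max(2, candidateCount * 30 / 100);
    }

    // Exclusive end index of the "top 50%" pool used for the random pick.
    static int randomPoolEnd(int candidateCount) {
        return Math.max(3, candidateCount * 50 / 100);
    }

    public static void main(String[] args) {
        for (int n : new int[] {9, 20, 30}) {
            System.out.printf(
                "n=%d -> diversity pool indices [1, %d), random pool indices [0, %d)%n",
                n, diversityPoolEnd(n), randomPoolEnd(n));
        }
    }
}
```

With the 9 candidates fetched by MAX_DAILY_MATCHES * 3, the diversity pool is the index range [1, 2), a single candidate, so the second pick is always the runner-up, and the random pick comes from the two remaining candidates among the first four. A larger over-fetch (for example 30 candidates, giving a diversity pool of indices 1 through 8) would give the diversity step more room to work.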

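Optimization Strategies for Scale

The current implementation works well with tens of thousands of active members. The strategies below prepare for growth to hundreds of thousands or millions.

Strategy 1: Parallel Processing (CompletableFuture)

Currently the member list is processed sequentially. Parallelizing with CompletableFuture and a custom thread pool can cut processing time substantially.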

@Bean("matchingBatchExecutor")
public Executor matchingBatchExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(8);
    executor.setMaxPoolSize(16);
    executor.setQueueCapacity(1000);
    executor.setThreadNamePrefix("matching-batch-");
    executor.initialize();
    return executor;
}

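// Parallel batch processing.
// Assumes matchingBatchExecutor (the bean above) is injected as a field and
// that Lists.partition comes from Guava.
public MatchingBatchResult executeDailyBatch() {
    List<Long> memberIds = matchingProfileRepository.findActiveMemberIds();

    // Split into partitions of 1,000 members
    List<List<Long>> partitions = Lists.partition(memberIds, 1000);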

    List<CompletableFuture<Integer>> futures = partitions.stream()
        .map(partition -> CompletableFuture.supplyAsync(
            () -> processPartition(partition),
            matchingBatchExecutor
        ))
        .toList();

    int totalCreated = futures.stream()
        .map(CompletableFuture::join)
        .mapToInt(Integer::intValue)
        .sum();

    return new MatchingBatchResult(memberIds.size(), totalCreated);
}
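The same partition-and-join pattern can be tried outside Spring with only the JDK. The following is a minimal sketch, assuming a stubbed processPartition that pretends every member receives three matches; ParallelBatchDemo, partition, and runBatch are hypothetical names, and the hand-written partition stands in for Guava's Lists.partition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.LongStream;

// Hypothetical stand-alone version of the partition-and-join pattern.
// processPartition is a stub; the real one would run the matching pipeline.
final class ParallelBatchDemo {

    // Splits ids into consecutive chunks of at most `size` elements.
    static <T> List<List<T>> partition(List<T> ids, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += size) {
            chunks.add(ids.subList(from, Math.min(from + size, ids.size())));
        }
        return chunks;
    }

    // Stub: pretends every member receives three matches.
    static int processPartition(List<Long> memberIds) {
        return memberIds.size() * 3;
    }

    static int runBatch(List<Long> memberIds, int partitionSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every chunk first; toList() materializes all futures before any join.
            List<CompletableFuture<Integer>> futures = partition(memberIds, partitionSize).stream()
                .map(chunk -> CompletableFuture.supplyAsync(() -> processPartition(chunk), pool))
                .toList();
            // join() blocks until a chunk finishes and rethrows failures as CompletionException.
            return futures.stream().mapToInt(f -> f.join()).sum();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> ids = LongStream.rangeClosed(1, 2_500).boxed().toList();
        System.out.println(runBatch(ids, 1_000, 8)); // 7500
    }
}
```

Collecting the futures into a list before joining matters: chaining the submit and the join in a single lazy stream would wait on each chunk before submitting the next and quietly serialize the work. Note also that one failed chunk makes join() throw, so a production version would likely handle failures per partition.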

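Strategy 2: Pre-Indexing Candidates (Approximate Nearest Neighbor)

At a scale of millions of members, computing cosine similarity for every pair becomes impractical, because the number of pairs grows quadratically with the member count. This is where approximate nearest neighbor (ANN) algorithms come in.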
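A quick calculation shows how fast the pair count grows. The sketch below counts unordered member pairs, n(n-1)/2, for a few sizes. It is an upper bound that ignores the filtering step, and PairCountDemo is a hypothetical name used only here.

```java
// Hypothetical helper: counts unordered member pairs before any filtering.
final class PairCountDemo {

    // Number of unordered pairs among n members: n * (n - 1) / 2.
    static long pairCount(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        for (long n : new long[] {10_000, 100_000, 1_000_000}) {
            System.out.printf("%,d members -> %,d pairs%n", n, pairCount(n));
        }
    }
}
```

Going from 10,000 to 1,000,000 members multiplies the member count by 100 but the pair count by roughly 10,000 (from 49,995,000 to 499,999,500,000), which is the growth an ANN index is meant to sidestep.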

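Representative options:

FAISS (Facebook AI Similarity Search): a C++ library with official Python bindings; Java access typically goes through community-maintained wrappers
pgvector: a PostgreSQL extension that supports vector columns and ANN indexes at the database level
Elasticsearch kNN: vector search in a distributed environment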

-- pgvector usage example (future expansion plan)
-- store matching_vector as a vector type
ALTER TABLE matching_profiles
ADD COLUMN embedding vector(19);  -- 19 dimensions (5 OCEAN + 10 values + 4 attachment)

-- Create an IVFFlat index (approximate nearest neighbor)
CREATE INDEX idx_matching_embedding
ON matching_profiles USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

-- Top N by cosine similarity (handled in a single query)
SELECT member_id,
       1 - (embedding <=> '[72,58,81,65,34,...]'::vector) AS soul_score
FROM matching_profiles
WHERE is_active = true
  AND member_id NOT IN (blocked_ids)
ORDER BY embedding <=> '[72,58,81,65,34,...]'::vector
LIMIT 30;

썸블링 currently uses a JSONB-based Native Query, but we plan to migrate to pgvector once membership passes 100,000.
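When calling such a query from Java, the profile vector has to be rendered in pgvector's text format, a bracketed comma-separated list such as [1,2,3], and then cast with ::vector as in the SQL above. A minimal sketch of that conversion follows; VectorLiteralDemo and toVectorLiteral are hypothetical names, and binding the resulting string as a query parameter is left out.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper: renders a double[] in pgvector's text format, e.g. [72.0,58.5].
final class VectorLiteralDemo {

    // expectedDimensions guards against a vector that does not match the column size.
    static String toVectorLiteral(double[] v, int expectedDimensions) {
        if (v.length != expectedDimensions) {
            throw new IllegalArgumentException(
                "expected " + expectedDimensions + " dimensions but got " + v.length);
        }
        return Arrays.stream(v)
            .mapToObj(Double::toString)
            .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(toVectorLiteral(new double[] {72, 58.5, 81}, 3)); // [72.0,58.5,81.0]
    }
}
```

For the vector(19) column in the example, expectedDimensions would be 19. Passing the literal as a bound parameter rather than concatenating it into the SQL string keeps the query safe from injection.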

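Strategy 3: Distributed Batch Processing (Spring Batch)

Once membership reaches the millions, a batch on a single instance may not finish within the two-hour early-morning window. At that point, Spring Batch with partitioning distributes the work.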

@Configuration
@RequiredArgsConstructor
public class MatchingBatchConfig {

    private final JobRepository jobRepository;
    private final DataSource dataSource;
    // Assumed collaborators: the original snippet references these without declaring them.
    private final TaskExecutor matchingBatchExecutor;
    private final MatchingProfileRepository matchingProfileRepository;

    @Bean
    public Job dailyMatchingJob() {
        return new JobBuilder("dailyMatchingJob", jobRepository)
            .start(partitionedMatchingStep())
            .build();
    }

    @Bean
    public Step partitionedMatchingStep() {
        return new StepBuilder("partitionedMatchingStep", jobRepository)
            .partitioner("matchingStep", memberIdPartitioner())
            .step(matchingStep())  // worker step, defined elsewhere (not shown)
            .gridSize(8)  // split into 8 partitions
            .taskExecutor(matchingBatchExecutor)
            .build();
    }

    @Bean
    public Partitioner memberIdPartitioner() {
        // Split all member IDs into gridSize roughly equal slices
        return gridSize -> {
            List<Long> allIds = matchingProfileRepository.findActiveMemberIds();
            int partitionSize = allIds.size() / gridSize;
            Map<String, ExecutionContext> partitions = new HashMap<>();
            for (int i = 0; i < gridSize; i++) {
                ExecutionContext ctx = new ExecutionContext();
                int from = i * partitionSize;
                // The last partition also takes the remainder.
                int to = (i == gridSize - 1) ? allIds.size() : from + partitionSize;
                // Copy the slice: subList views are not Serializable, which can
                // break persistence of the ExecutionContext.
                ctx.put("memberIds", new ArrayList<>(allIds.subList(from, to)));
                partitions.put("partition" + i, ctx);
            }
            return partitions;
        };
    }
}
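The partitioner above gives the whole remainder to the last partition. An alternative, sketched below under the assumption that partitions should differ in size by at most one, spreads the remainder across the first few partitions. BalancedPartitionDemo and split are hypothetical names, and the code is plain Java with no Spring Batch types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list into gridSize chunks whose sizes differ by at most one.
final class BalancedPartitionDemo {

    static <T> List<List<T>> split(List<T> ids, int gridSize) {
        if (gridSize <= 0) {
            throw new IllegalArgumentException("gridSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        int base = ids.size() / gridSize;
        int remainder = ids.size() % gridSize;
        int from = 0;
        for (int i = 0; i < gridSize; i++) {
            // The first `remainder` chunks each take one extra element.
            int to = from + base + (i < remainder ? 1 : 0);
            // Copy the view so each chunk is an independent ArrayList.
            chunks.add(new ArrayList<>(ids.subList(from, to)));
            from = to;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(split(ids, 4)); // [[1, 2, 3], [4, 5, 6], [7, 8], [9, 10]]
    }
}
```

For large member lists the difference is small, since the last partition in the original receives at most gridSize - 1 extra IDs. The balanced version mainly helps when the list is short relative to gridSize, where the original would leave the early partitions empty.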

Batch Monitoring and Alerts

When the batch fails, we need a way to find out immediately.

/**
 * Event listener that records batch execution results.
 * Sends a Slack alert on failure.
 *
 * @author Rojae
 */
@Component
@RequiredArgsConstructor
@Slf4j
public class MatchingBatchEventListener {

    private final SlackNotificationService slackService;

    @EventListener
    public void onBatchCompleted(MatchingBatchCompletedEvent event) {
        MatchingBatchResult result = event.getResult();

        // Zero matches created is a warning sign
        if (result.getTotalMatchesCreated() == 0 && result.getTotalMembers() > 0) {
            slackService.sendAlert(
                "⚠️ Daily matching batch anomaly: " + result.getTotalMembers()
                + " target members but 0 matches created."
            );
        }

        // Warn when processing takes longer than one hour
        if (result.getElapsedMs() > 3_600_000) {
            slackService.sendAlert(
                "⏱️ Daily matching batch delayed: took " +
                (result.getElapsedMs() / 60_000) + " minutes"
            );
        }

        log.info("Batch result: targets={}, created={}, elapsed={}ms",
            result.getTotalMembers(),
            result.getTotalMatchesCreated(),
            result.getElapsedMs()
        );
    }

    @EventListener
    public void onBatchFailed(MatchingBatchFailedEvent event) {
        slackService.sendAlert(
            "🚨 Daily matching batch failed: " + event.getException().getMessage()
        );
    }
}
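The two alert conditions in the listener are simple enough to pull out into a pure function, which makes them testable without Slack or Spring events. The sketch below is one hypothetical way to do that; BatchAlertRulesDemo, evaluate, and the alert codes NO_MATCHES and SLOW_BATCH are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical extraction of the listener's alert rules into a pure function.
final class BatchAlertRulesDemo {

    // One hour, the same threshold the listener above uses.
    static final long MAX_ELAPSED_MS = 3_600_000;

    // Returns the alert codes that apply to one batch run; empty when the run looks healthy.
    static List<String> evaluate(int totalMembers, int totalMatchesCreated, long elapsedMs) {
        List<String> alerts = new ArrayList<>();
        if (totalMembers > 0 && totalMatchesCreated == 0) {
            alerts.add("NO_MATCHES");
        }
        if (elapsedMs > MAX_ELAPSED_MS) {
            alerts.add("SLOW_BATCH");
        }
        return alerts;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(60_000, 0, 4_000_000));     // [NO_MATCHES, SLOW_BATCH]
        System.out.println(evaluate(60_000, 180_000, 58_000));  // []
    }
}
```

A listener could map each returned code to a message and forward it to the notification service, keeping the thresholds in one place.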

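Measured Batch Performance

These results were measured in an internal test environment (PostgreSQL 16, 8 vCPU, 16 GB RAM).

Active members	Processing mode	Elapsed time
1,000	Sequential	8 s
10,000	Sequential	84 s
10,000	8 threads in parallel	12 s
50,000	8 threads in parallel	58 s
100,000	8 threads in parallel (projected)	~120 s
100,000	pgvector + parallel (projected)	~30 s

Against the goal of starting at 2 a.m. and finishing within two hours, the current architecture can reliably handle up to about 300,000 members. Beyond that, pgvector and distributed batch processing become necessary.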

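Closing Thoughts

The daily matching batch is the heart of 썸블링. The goal of this system is to deliver, reliably and every day, the experience of opening the app in the morning and finding today's matches waiting.

The current implementation, a combination of Spring Scheduler, QueryDSL filtering, and Native Query cosine similarity, works well enough today. As the user base grows, our roadmap is to introduce parallel processing, pgvector, and Spring Batch partitioning in stages.

A batch system is invisible infrastructure, yet it determines the most fundamental part of the user experience. Giving a dependable answer, every day, to the hopeful question "Is there someone right for me today?" is the reason this system exists.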