Internal study session materials: Open Data Synthesis For Deep Research

Open Data Synthesis For Deep Research
#1 Paper of the day

© NABLAS Inc. All Rights Reserved
There is a huge gap in accuracy between open-source deep research agents and proprietary ones
Motivation

A deep research task is a complex information seeking activity characterized by multi-layered
information dependencies.
Deep research

Good questions should be unambiguous and verifiable
What are good questions?

The goal is to identify the unique answer set A that simultaneously satisfies all constraints extracted
from the question (e.g., Sudoku).
If |A| is larger than 1, the answer is ambiguous.
Constraint Satisfaction Problem (CSP)
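
A minimal sketch of the CSP view above, assuming candidates are plain strings and each constraint is a predicate over a candidate. The names `answer_set`, `is_ambiguous`, and the toy `CITY_FACTS` table are hypothetical and not taken from the paper; they only illustrate computing A and checking |A| > 1.

```python
from typing import Callable, Iterable, Set

Constraint = Callable[[str], bool]


def answer_set(candidates: Iterable[str], constraints: Iterable[Constraint]) -> Set[str]:
    """Return every candidate that satisfies all constraints at once."""
    checks = list(constraints)
    return {c for c in candidates if all(check(c) for check in checks)}


def is_ambiguous(answers: Set[str]) -> bool:
    """A question is ambiguous when more than one answer survives (|A| > 1)."""
    return len(answers) > 1


# Toy knowledge base (hypothetical data, for illustration only).
CITY_FACTS = {
    "Paris": {"country": "France", "capital": True},
    "Lyon": {"country": "France", "capital": False},
    "Rome": {"country": "Italy", "capital": True},
}


def in_france(city: str) -> bool:
    return CITY_FACTS[city]["country"] == "France"


def is_capital(city: str) -> bool:
    return CITY_FACTS[city]["capital"]


# One constraint leaves two answers; adding a second makes the answer unique.
loose = answer_set(CITY_FACTS, [in_france])
tight = answer_set(CITY_FACTS, [in_france, is_capital])
```

With a single constraint the question is ambiguous; adding a second constraint narrows A to one element, which is the property a good question needs.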

Multi-hop Problem (MHP)

Hierarchical Constraint Satisfaction Problem (HCSP)
Let’s make it hierarchical
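
The slide does not give a formal definition of HCSP, so the following is only one possible reading: a sketch in which a node's answer depends on its own constraints plus constraints that refer to the resolved answers of lower-level nodes. The `Node` class, `resolve` function, and the toy people/cities data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class Node:
    """One level of a hierarchical question (hypothetical structure)."""
    name: str
    candidates: Set[str]
    # Constraints that look only at the candidate itself.
    local: List[Callable[[str], bool]] = field(default_factory=list)
    # Constraints that also need the resolved answer of a lower-level node.
    linked: List[Tuple["Node", Callable[[str, str], bool]]] = field(default_factory=list)


def resolve(node: Node) -> str:
    """Resolve lower levels first, then filter this node's candidates."""
    survivors = {c for c in node.candidates if all(p(c) for p in node.local)}
    for child, predicate in node.linked:
        child_answer = resolve(child)
        survivors = {c for c in survivors if predicate(c, child_answer)}
    if len(survivors) != 1:
        raise ValueError(f"{node.name}: expected a unique answer, got {sorted(survivors)}")
    return next(iter(survivors))


# Toy data (hypothetical): find the hometown of the person born in 1990.
BIRTH_YEAR = {"Alice": 1990, "Bob": 1985}
HOMETOWN = {"Alice": "Oslo", "Bob": "Lima"}

person = Node("person", {"Alice", "Bob"}, local=[lambda p: BIRTH_YEAR[p] == 1990])
city = Node("city", {"Oslo", "Lima"}, linked=[(person, lambda c, p: HOMETOWN[p] == c)])

answer = resolve(city)
```

Raising an error when a level does not resolve to exactly one answer mirrors the unambiguity requirement from the CSP slide.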

InfoSeek builds so-called research trees, which are used to generate question/answer pairs
InfoSeek
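[Figure: example research tree built from Wikipedia. A parent node and a child node each hold an entity plus its page content ("Facheiroa cephaliomelana", "Albert Frederik Hendrik Buining"), with the relation between nodes illustrated as "A is the son of B".]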
Example

The Planner and the Browser take actions interactively to construct research trees
InfoSeek
[Figure: tree-construction steps, labeled ACTION 1, ACTION 2, and ACTION 3 on the slide: initializing the root node; setting claims and extending the tree; generating a question/answer pair (questions should be difficult enough and verifiable).]
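
The three steps named on this slide can be sketched on a toy link graph. This is an illustration under assumptions, not the InfoSeek implementation: `TOY_LINKS` stands in for browsing Wikipedia, the fixed-depth loop stands in for the Planner, and the function names and question template are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Toy "web": entity -> list of (relation, linked entity). Hypothetical data.
TOY_LINKS = {
    "Entity A": [("is the son of", "Entity B")],
    "Entity B": [("was born in", "Entity C")],
    "Entity C": [],
}


@dataclass
class TreeNode:
    entity: str
    # Each edge stores a claim about this node and the child it points to.
    edges: List[Tuple[str, "TreeNode"]] = field(default_factory=list)


def init_root(entity: str) -> TreeNode:
    """Step 1: initialize the root node."""
    return TreeNode(entity)


def extend(node: TreeNode) -> List[TreeNode]:
    """Step 2: set claims on the node and extend the tree (Browser stand-in)."""
    for relation, target in TOY_LINKS.get(node.entity, []):
        node.edges.append((relation, TreeNode(target)))
    return [child for _, child in node.edges]


def build_tree(root_entity: str, max_depth: int) -> TreeNode:
    """Planner stand-in: extend level by level up to a fixed depth."""
    root = init_root(root_entity)
    frontier = [root]
    for _ in range(max_depth):
        frontier = [child for node in frontier for child in extend(node)]
    return root


def describe(node: TreeNode) -> str:
    """Refer to a node by its claims instead of its name when it has any."""
    if not node.edges:
        return node.entity
    parts = [f"{rel} {describe(child)}" for rel, child in node.edges]
    return "the entity that " + " and ".join(parts)


def make_qa(root: TreeNode) -> Tuple[str, str]:
    """Step 3: generate a question whose answer is the root entity."""
    parts = [f"{rel} {describe(child)}" for rel, child in root.edges]
    return f"Which entity {' and '.join(parts)}?", root.entity


question, answer = make_qa(build_tree("Entity A", max_depth=2))
```

Here `question` is "Which entity is the son of the entity that was born in Entity C?" and `answer` is "Entity A". A deeper tree hides more intermediate entities behind claims, which is what makes the question multi-layered.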

Example
Question: What was the public health program that was managed by a general who was appointed by Mario Draghi, and was mandatory for people over 50 from 15/02/2022 to 30/06/2022?
Answer: COVID-19 vaccination in Italy

Question: What is the shape of the object featured in the 2025 episode of the television show known for celebrity gossip and scandals that was spotted over a clandestine military installation of the nation known for its fifty stars on its flag in 2015?
Answer: Diamond-shaped
Qwen2.5-32B-Inst can answer only 2% of the questions (no tools)

InfoSeeker - distillation
Message: … **Question:** What was an election won by a person whose nomination was engineered by Frank Hague?
Response: <answer> Nguyễn Ngọc Kiều Duy </answer>
Qwen2.5-72B → Qwen2.5-3B-Inst
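
The response shown on this slide wraps the final answer in `<answer>` tags. A minimal sketch of how such a response could be parsed and compared with a reference answer follows; the helper names and the case-insensitive exact-match rule are assumptions, not the paper's evaluation code.

```python
import re
from typing import Optional

# Non-greedy match so that several <answer> blocks are captured separately.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(response: str) -> Optional[str]:
    """Return the text of the last <answer>...</answer> block, or None."""
    matches = ANSWER_RE.findall(response)
    return matches[-1].strip() if matches else None


def exact_match(response: str, gold: str) -> bool:
    """Case-insensitive exact match between the predicted and gold answers."""
    predicted = extract_answer(response)
    return predicted is not None and predicted.casefold() == gold.strip().casefold()


sample = "Some reasoning text. <answer> Diamond-shaped </answer>"
predicted = extract_answer(sample)
```

A check like this could be used to keep only teacher responses whose answer matches the reference, or as a simple reward signal; both uses are suggestions rather than details stated on the slide.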

GRPO
2 rounds
InfoSeeker - RL
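
GRPO scores each sampled response relative to the other responses sampled for the same question. The sketch below shows only that group-relative advantage step, assuming scalar rewards and population standard deviation (implementations differ on this detail); the function name and `eps` parameter are hypothetical, and the policy update itself is omitted.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of responses to the same question."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every response gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one question: two correct (1.0), two wrong (0.0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct responses receive a positive advantage and wrong ones a negative advantage; when all responses in a group get the same reward, every advantage is zero and the group contributes no learning signal.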

Notably, most baselines rely heavily on large amounts of in-domain supervision (i.e., more than
100K NQ and HQA examples), whereas our approach leverages the purpose-built InfoSeek dataset for
training
Experiment

GPT-5…
Experiment

Can we apply it to data unlike Wikipedia, especially data without explicit links?
Can we apply it to generate fact-check datasets for GENIAC #3?
Discussion
