The document presents CTVS, a method for data extraction and alignment from web databases that automatically extracts query result records (QRRs) from result pages and aligns their data values into a structured table. It also introduces an unsupervised duplicate detection method (UDD) that identifies duplicate records across multiple web databases. The techniques address challenges posed by non-contiguous data and nested structures, and combining tag similarity with value similarity improves extraction accuracy, which in turn supports the integration of data from various web sources.
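
The idea of combining tag and value similarity can be illustrated with a minimal sketch. This is not the CTVS algorithm itself. It assumes each candidate record is a dictionary with a list of HTML tag names (`tags`) and its visible text (`text`), scores tag structure with `difflib.SequenceMatcher`, scores values with token-level Jaccard overlap, and mixes the two with a hypothetical `tag_weight` parameter. All function names, the record layout, and the weighting scheme are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def tag_similarity(tags_a, tags_b):
    """Similarity of two tag-name sequences, in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, tags_a, tags_b).ratio()


def value_similarity(text_a, text_b):
    """Jaccard overlap of lowercased whitespace tokens (assumed measure)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty values are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def combined_similarity(rec_a, rec_b, tag_weight=0.5):
    """Weighted mix of tag and value similarity (hypothetical weighting)."""
    tag_sim = tag_similarity(rec_a["tags"], rec_b["tags"])
    val_sim = value_similarity(rec_a["text"], rec_b["text"])
    return tag_weight * tag_sim + (1.0 - tag_weight) * val_sim


if __name__ == "__main__":
    # Two made-up records with the same tag layout and overlapping text.
    r1 = {"tags": ["tr", "td", "a", "td"], "text": "Data Mining Concepts 2011"}
    r2 = {"tags": ["tr", "td", "a", "td"], "text": "Data Mining Techniques 2011"}
    print(combined_similarity(r1, r2))
```

Two records that share a tag layout but differ in content score high on the tag term and lower on the value term, so the combined score separates them from records that match on both. A real system would need far more robust measures than these.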