This document presents a new approach, called "Host-IP cluster random sampling", for estimating the size of the deep web more accurately. Earlier methods based on random sampling of IP addresses were inaccurate, largely because of issues such as virtual hosting, where many hosts share a single IP address. The new approach resolves hostnames to IP addresses, clusters the hosts that share an IP, and then randomly samples the resulting IP clusters. Applied to a dataset of 670,000 Russian web hosts, it found approximately 14,200 deep web sites, whereas IP sampling found only around 4,000, an underestimate by a factor of about 3.5. The Host-IP method therefore characterizes the size of a national deep web more reliably than prior techniques.
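
The three steps named above (resolve, cluster by IP, sample) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the `is_deep_site` predicate (a stand-in for whatever procedure decides that a host is a deep web site), and the scale-up estimator (number of clusters times the mean count per sampled cluster) are all assumptions introduced here.

```python
import random
import socket
from collections import defaultdict


def cluster_hosts_by_ip(hostnames, resolve=socket.gethostbyname):
    """Group hostnames by the IP address they resolve to.

    `resolve` is injectable so the sketch can run without network access.
    """
    clusters = defaultdict(list)
    for host in hostnames:
        try:
            ip = resolve(host)
        except OSError:
            continue  # skip hosts that do not resolve
        clusters[ip].append(host)
    return dict(clusters)


def estimate_deep_sites(clusters, sample_size, is_deep_site, seed=None):
    """Estimate the total number of deep web sites from a sample of clusters.

    Assumed estimator: every host in each sampled cluster is checked, and
    the count is scaled by (total clusters / sampled clusters).
    `is_deep_site` is a hypothetical predicate supplied by the caller.
    """
    rng = random.Random(seed)
    ips = sorted(clusters)
    sampled = rng.sample(ips, min(sample_size, len(ips)))
    if not sampled:
        return 0.0
    found = sum(
        1 for ip in sampled for host in clusters[ip] if is_deep_site(host)
    )
    return len(ips) * found / len(sampled)
```

Because every host in a sampled cluster is inspected, hosts that share an IP address through virtual hosting are all counted, which is the case that plain IP sampling is described as handling poorly.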