Popular content in video sharing web sites (e.g., YouTube) is usually duplicated. Most scholars define near-duplicate video clips (NDVC) based on non-semantic features (e.g., different image/audio quality), while a few also include semantic features (different videos of similar content). However, it is unclear what features contribute to the human perception of similar videos. Findings of two large scale online surveys (N = 1003) confirm the relevance of both types of features. While some of our findings confirm the adopted definitions of NDVC, other findings are surprising. For example, videos that vary in visual content –by overlaying or inserting additional information– may not be perceived as near-duplicate versions of the original videos. Conversely, two different videos with distinct sounds, people, and scenarios were considered to be NDVC because they shared the same semantics (none of the pairs had additional information). Furthermore, the exact role played by semantics in relation to the features that make videos alike is still an open question. In most cases, participants preferred to see only one of the NDVC in the search results of a video search query and they were more tolerant to changes in the audio than in the video tracks. Finally, we propose a user-centric NDVC definition and present implications for how duplicate content should be dealt with by video sharing websites.