Locality Sensitive Hashing, Jaccard Similarity, Duplicate Detection, Document Clustering