1. Introduction
Random data projection with hash-based index structures has come to be the state-of-the-art method for both an efficient near neighbor search and a scalable cluster analysis for very high-dimensional data. Data of very high dimensions pose serious computational challenge because of curse-of-dimensionality [Slide]. Random projections are also commonly used for dimension reduction. An instance of random projection, Locality sensitive hashing, has widely attracted the attention of research community and is an active area of further research. This article gives resources for detailed reading on projection method.
2. Near Neighbor Search
a) Types of queries
1. Nearest neighbor queries
2. Top-k nearest neighbors queries
3. Range queries
4. r-Near neighbor queries or ball-queries
5. Approximate nearest neighbor queries (ANN )
6. All nearest-neighbor queries or Group queries or multi-object queries
7. Reverse nearest neighbor queries
8. List of other queries
7. Reverse nearest neighbor queries
8. List of other queries
b) Hashing functions for different distance metrics
6.
c) References
2012
1. SIMP: Accurate and Efficient Near Neighbor Search in High Dimensional Spaces ( Note: Method shown to yield very efficient search result on more than 100 million 128-dimensional vectors. This is one of the largest dataset used for measuring the performance of a high-dimensional search algorithm).
2011
2010
1. Kernelized Locality-Sensitive Hashing for Scalable Image Search
2. An Algorithm and Hardware Design for Very Fast Similarity Search in High Dimensional Space
2009
1. LSB Tree: Efficient and Accurate Nearest Neighbor and Closest Pair Search in High Dimensional Space
2008
2007
2006
2005
2001
1. Random projection in dimensionality reduction: Applications to image and text data
2. Database-friendly Random Projections
3. Database-friendly random projections: Johnson-Lindenstrauss with binary coins
2000
1. Experiments with Random Projections
1. Random projection in dimensionality reduction: Applications to image and text data
2. Database-friendly Random Projections
3. Database-friendly random projections: Johnson-Lindenstrauss with binary coins
2000
1. Experiments with Random Projections
1999
1. Similarity Search in High Dimensions via Hashing
2. A Small Approximately Min-Wise Independent Family of Hash Functions
1. Similarity Search in High Dimensions via Hashing
2. A Small Approximately Min-Wise Independent Family of Hash Functions
1998
1. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality
2. Min-Wise Independent Permutations
1997
1. Two algorithms for nearest-neighbor search in high dimensions
1984
1. W.B. Johnson and J. Lindenstrauss. Extensions of Lipshitz mapping into Hilbert space
1. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality
2. Min-Wise Independent Permutations
1997
1. Two algorithms for nearest-neighbor search in high dimensions
1984
1. W.B. Johnson and J. Lindenstrauss. Extensions of Lipshitz mapping into Hilbert space
d) Implementations
3. Image Search using Random Projections and LSH
1. Random projection in dimensionality reduction: Applications to image and text data2. Efficient Near-duplicate detection and sub-image retrieval
3. Near Duplicate Image Detection: min-Hash and tf-idf Weighting
4. Geometric min-Hashing: Finding a (Thick) Needle in a Haystack
5. Caltech Work
4. Scalable Cluster Analysis using Min-Hash
1. Scalable technique for clustering the web
2. Random Projection for High Dimensional Data Clustering: A Cluster Ensemble Approach
3. Dimensionality Reduction by Random Mapping: Fast Similarity Computation for Clustering
2. Random Projection for High Dimensional Data Clustering: A Cluster Ensemble Approach
3. Dimensionality Reduction by Random Mapping: Fast Similarity Computation for Clustering
No comments:
Post a Comment