Friday, June 8, 2012

Software Engineering Interview Questions - 1

1.  Given a time-sheet of employees entrance and exit from  an office over a period of month, provide an algorithm to find the number of employees at any given point of time.

Time-Sheet  Format
Date                    In                 Out
mm:dd:yyyy     hr:min:sec     hr:min:sec
01:01:2012     06:00:00       16:00:00
.................      .............        .............
14:01:2012    06:30:00         17:05:00
.................     ..............        ..............


Given query time   03:01:2012     00:00:00
Find all the employees in the office at the given time.




2.  Develop a method to sort an array of strings such that all the anagrams are next to each other.
     Hint: sort each string alphabetically.  Now, all the anagrams will have the same string representation. Then, either perform a lexicographical sort or use the string as key to hash in a hash-map




3.  Given an array of strings , find the first unique occurring string.
     For example let  S= {"ab","ac","ab","bd","ac","edf"} be an array of strings. The first unique string is "bd".


4.  Given an integer N, find all the prime numbers less than or equal to N.
      Hint:  Use Sieve of Eratosthenes algorithm
               


5.  Find the Nth prime number.
     Answer: No such general formula exists. There are few constrained  formulations. 
     Details:   prime number types ,  formula of primes formula of primes 2 , formula of primes 3, formula of primes 4



6.  Given a file of billion of numbers, write a program to remove the duplicates.



7. Given a matrix of 1's and 0's, write a program to find
    a) Largest square sub-matrix of only 1's
    b) Largest rectangular sub-matrix of only 1's


8.  Find the median of two sorted arrays.


9.  Find the Kth largest element in an unsorted array or find top-k elements in an array.

     ( Hint:  Selection Algorithm (median-of-median linear time algorithm) )

Thursday, June 7, 2012

Binary Tree Construction and Traversals

Tree Construction

1.  Given a sorted linked list, construct a Binary Search Tree.

2.  Given a doubly linked list, construct a Binary Search Tree and vice-a-versa.

3. Construct a  Binary Search Tree  given a preorder and an inorder traversal of the tree.

  Example:
    
   Inorder:  B D A E F C
   Preorder:  A B D C E F

   Java source code:  Main class -> TreeConstructionInorderPreorder.java
                                Other classes -> BinaryTree.java, Node.java  
   Input File: TreeTraInput.txt

Traversals

Types of Binary Tree
    1. Normal Binary Tree
    2. Binary Search Tree
    3. Threaded Binary Tree

Type of Traversals ( Wiki )

    Depth-First
         1.  Preorder
         2.  Inorder
         3.  Postorder
  
    Breadth First 
        4. Level order  
        5. Zig-Zag 
    
Methodology
    1. Recursive 
    2. Iterative using Stack
    3. Inorder tree traversal without recursion and  use of stack ( Hint : Morris Traversal, Morris Traversal 2)
    

Nearest Neighbor Search Using Data Projections and Hashing

1. Introduction


Random data projection with hash-based index structures has come to be the state-of-the-art method for both an efficient near neighbor search and a scalable cluster analysis for very high-dimensional data.  Data of very high dimensions pose serious computational challenge because of curse-of-dimensionality [Slide]. Random projections are also commonly used for dimension reduction.  An instance of random projection, Locality sensitive hashing, has widely attracted the attention of research community and is an active area of further research. This article gives resources for detailed reading on projection method.


2. Near Neighbor Search  


    a)  Types of queries

       1. Nearest neighbor queries
       2. Top-k nearest neighbors queries
       3. Range queries
       4. r-Near neighbor queries or ball-queries
       6. All nearest-neighbor queries  or Group queries or multi-object queries
       7. Reverse nearest neighbor queries
       8. List of other queries
       

   b) Hashing functions for different distance metrics

      6.

   c) References 

   
      2012
              1. SIMP: Accurate and Efficient Near Neighbor Search in High Dimensional Spaces ( Note: Method shown to yield very efficient search result on more than 100 million 128-dimensional vectors.  This is one of the largest dataset used for measuring the performance of a high-dimensional search algorithm). 
      2009

      2008
              2.  Modeling LSH for Performance Tuning
              3.  Lecture Notes

      2007

      2005
          
     2003
     2002
            1Similarity Estimation Techniques from Rounding Algorithms

     1998
           1.  Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality
           2.  Min-Wise Independent Permutations

     1997
            1. Two algorithms for nearest-neighbor search in high dimensions

     1984
            1.  W.B. Johnson and J. Lindenstrauss. Extensions of Lipshitz mapping into Hilbert space

  

  d) Implementations

          1.  Implementation of p-Stable LSH
          2.  LSHKit
          3.  SIMP
          4.  LSB


4. Scalable Cluster Analysis using Min-Hash