Technical IT Interview Questions

Design a system that efficiently calculates the top 1MM (1 million) Google search queries and produces a report of them. Additionally:

  • You are given twelve servers.
  • Each has two processors, 4 GB of RAM, and four 400 GB hard drives.
  • The machines are networked.
  • The log data has roughly 100 billion log lines in it.
  • The log data comes in twelve 320 GB files.
  • Each line of the files has roughly 40 search queries.
  • You can only use open source software or software that you write.
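
One possible line of attack, offered only as a minimal sketch and not as the expected answer: hash-partition the queries so that every occurrence of a given query lands on the same server, count locally, take each server's local top k, and merge the twelve candidate lists. Because a query lives on exactly one shard, its local count is its global count, so the global top k must appear among the local top-k lists. The function names (shard_of, partition, local_top_k, global_top_k) and the use of CRC32 as the partitioning hash are assumptions made for illustration. The sketch runs in memory on one machine; with 4 GB of RAM per server, a real implementation would presumably have to spill counts to disk or aggregate by sorting.

```python
import heapq
import zlib
from collections import Counter

NUM_SHARDS = 12  # one shard per server, matching the twelve servers above


def shard_of(query, num_shards=NUM_SHARDS):
    """Map a query to a shard with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(query.encode("utf-8")) % num_shards


def partition(queries, num_shards=NUM_SHARDS):
    """Route each query to its shard; identical queries share a shard."""
    shards = [[] for _ in range(num_shards)]
    for query in queries:
        shards[shard_of(query, num_shards)].append(query)
    return shards


def local_top_k(shard_queries, k):
    """Count one shard's queries and keep its k most frequent."""
    counts = Counter(shard_queries)
    return heapq.nlargest(k, counts.items(), key=lambda item: item[1])


def global_top_k(queries, k, num_shards=NUM_SHARDS):
    """Merge the per-shard candidates into the overall top k."""
    candidates = []
    for shard in partition(queries, num_shards):
        candidates.extend(local_top_k(shard, k))
    return heapq.nlargest(k, candidates, key=lambda item: item[1])


if __name__ == "__main__":
    sample = ["cats"] * 5 + ["dogs"] * 3 + ["birds"] * 2 + ["fish"]
    for query, count in global_top_k(sample, 2):
        print(query, count)
```

In the setting described above, k would be 1,000,000, and each call to local_top_k would stand in for the work done on one of the twelve servers, with only the candidate lists crossing the network.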